This paper addresses the problem of Sketch-Based Image Retrieval (SBIR), for
which bridging the gap between the data representations of sketch images and
photo images is considered the key. Previous works mostly focus on learning
a feature space to minimize intra-class distances for both sketches and photos...
In contrast, we propose a novel loss function, named Euclidean Margin Softmax
(EMS), that simultaneously minimizes intra-class distances and maximizes
inter-class distances. It enables us to learn a feature space
with high discriminability, leading to highly accurate retrieval. In addition,
this loss function is applied to a conditional network architecture, which
can incorporate the prior knowledge of whether a sample is a sketch or a
photo. We show that the conditional information can be conveniently
incorporated into the recently proposed Squeeze-and-Excitation (SE) module,
leading to a conditional SE (CSE) module. Extensive experiments are conducted on two
widely used SBIR benchmark datasets. Our approach, although very simple,
achieves new state-of-the-art results on both datasets, surpassing existing
methods by a large margin.
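
To make the conditional SE idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the sketch/photo prior is encoded as a two-element one-hot code appended to the globally pooled channel vector before the two gating layers. The function names (`conditional_se`, `linear`), the weights, and the toy feature map are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    # weights is a list of rows; each row has the same length as vec.
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def conditional_se(feature_map, is_sketch, w1, b1, w2, b2):
    """Rescale channels of a C x H x W feature map (nested lists).

    Assumption: the domain prior is a one-hot code, [1, 0] for a sketch
    and [0, 1] for a photo, concatenated to the squeezed vector.
    """
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Condition: append the domain code to the squeezed vector.
    cond = [1.0, 0.0] if is_sketch else [0.0, 1.0]
    # Excitation: bottleneck layer with ReLU, then per-channel sigmoid gates.
    hidden = [max(0.0, h) for h in linear(w1, b1, squeezed + cond)]
    gates = [sigmoid(g) for g in linear(w2, b2, hidden)]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * x for x in row] for row in ch]
            for ch, g in zip(feature_map, gates)]


# Toy setup: 4 channels of size 2 x 2, bottleneck width 2 (all hypothetical).
C, H, W = 4, 2, 2
fmap = [[[float(c + 1)] * W for _ in range(H)] for c in range(C)]
w1 = [[0.1] * C + [1.0, 0.0],   # hidden unit 0 responds to the sketch bit
      [0.1] * C + [0.0, 1.0]]   # hidden unit 1 responds to the photo bit
b1 = [0.0, 0.0]
w2 = [[1.0, -1.0] for _ in range(C)]
b2 = [0.0] * C

out_sketch = conditional_se(fmap, True, w1, b1, w2, b2)
out_photo = conditional_se(fmap, False, w1, b1, w2, b2)
```

With these hand-picked weights the same feature map receives different channel gates depending on the domain code, which is the behavior the conditional module is meant to provide. A trained network would learn the weights rather than fix them by hand.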