Cross-Modal Hamming Hashing

ECCV 2018  ·  Yue Cao , Bin Liu, Mingsheng Long, Jian-Min Wang ·

Cross-modal hashing enables similarity retrieval across different content modalities, such as searching relevant images in response to text queries. It provides with the advantages of computation efficiency and retrieval quality for multimedia retrieval. Hamming space retrieval enables efficient constant-time search that returns data items within a given Hamming radius to each query, by hash lookups instead of linear scan. However, Hamming space retrieval is ineffective in existing cross-modal hashing methods, subject to their weak capability of concentrating the relevant items to be within a small Hamming ball, while worse still, the Hamming distances between hash codes from different modalities are inevitably large due to the large heterogeneity across different modalities. This work presents Cross-Modal Hamming Hashing (CMHH), a novel deep cross-modal hashing approach that generates compact and highly concentrated hash codes to enable efficient and effective Hamming space retrieval. The main idea is to penalize significantly on similar cross-modal pairs with Hamming distance larger than the Hamming radius threshold, by designing a pairwise focal loss based on the exponential distribution. Extensive experiments demonstrate that CMHH can generate highly concentrated hash codes and achieve state-of-the-art cross-modal retrieval performance for both hash lookups and linear scan scenarios on three benchmark datasets, NUS-WIDE, MIRFlickr-25K, and IAPR TC-12.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods