A Novel Self-Supervised Cross-Modal Image Retrieval Method In Remote Sensing

23 Feb 2022 · Gencer Sumbul, Markus Müller, Begüm Demir

Due to the growing availability of multi-modal remote sensing (RS) image archives, the development of cross-modal RS image retrieval (CM-RSIR) methods that search for semantically similar images across different modalities has become an important research topic. Existing CM-RSIR methods require a large quantity of high-quality annotated training images. Collecting a sufficient number of reliably labeled images is time-consuming, complex, and costly in operational scenarios, and this can significantly affect the final accuracy of CM-RSIR. In this paper, we introduce a novel self-supervised CM-RSIR method that aims to: i) model the mutual information between different modalities in a self-supervised manner; ii) keep the distributions of the modality-specific feature spaces similar to each other; and iii) identify the most similar images within each modality without requiring any annotated training images. To this end, we propose a novel objective comprising three loss functions that simultaneously: i) maximize the mutual information of different modalities for inter-modal similarity preservation; ii) minimize the angular distance of multi-modal image tuples for the elimination of inter-modal discrepancies; and iii) increase the cosine similarity of the most similar images within each modality for the characterization of intra-modal similarities. Experimental results show the effectiveness of the proposed method compared to state-of-the-art methods. The code of the proposed method is publicly available at https://git.tu-berlin.de/rsim/SS-CM-RSIR.
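To make the three-part objective concrete, the PyTorch sketch below shows one way such a loss could be composed from paired modality embeddings. The specific choices here are assumptions made for illustration, not the authors' exact formulation: InfoNCE is used as a standard lower-bound surrogate for mutual-information maximization, the batch-wise nearest neighbour stands in for "the most similar image within each modality", and the loss weights and temperature are placeholders. The authors' actual losses are defined in the paper and the linked repository.

```python
# Minimal sketch of a three-part self-supervised cross-modal objective
# in the spirit of the paper. Loss forms, weights, temperature, and the
# batch-wise nearest-neighbour heuristic are illustrative assumptions.
import torch
import torch.nn.functional as F


def info_nce(za, zb, temperature=0.1):
    """Contrastive (InfoNCE) loss: a standard lower-bound surrogate for
    the mutual information between the two modality embeddings za, zb."""
    za = F.normalize(za, dim=1)
    zb = F.normalize(zb, dim=1)
    logits = za @ zb.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(za.size(0), device=za.device)
    # Matching pairs (i, i) are positives; all other pairs are negatives.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


def angular_loss(za, zb):
    """Minimize the angular distance between paired multi-modal
    embeddings to reduce the inter-modal discrepancy."""
    cos = F.cosine_similarity(za, zb, dim=1).clamp(-1 + 1e-7, 1 - 1e-7)
    return torch.arccos(cos).mean()


def intra_modal_loss(z):
    """Increase the cosine similarity between each embedding and its most
    similar in-batch neighbour within the same modality (the batch-wise
    nearest neighbour is an assumption made for this sketch)."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t()
    sim.fill_diagonal_(-2.0)                    # exclude self-similarity
    nn_sim = sim.max(dim=1).values              # most similar other image
    return (1.0 - nn_sim).mean()


def total_loss(za, zb, w=(1.0, 1.0, 1.0)):
    """Combined objective over a batch of paired embeddings
    za (modality A) and zb (modality B), each of shape (B, D)."""
    return (w[0] * info_nce(za, zb)
            + w[1] * angular_loss(za, zb)
            + w[2] * (intra_modal_loss(za) + intra_modal_loss(zb)))
```

In this reading, the first term preserves inter-modal similarity, the second aligns the two modality-specific feature spaces by collapsing the angle between paired tuples, and the third characterizes intra-modal similarity without any labels.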
