Image retrieval systems aim to find images similar to a query image within an image dataset.
(Image credit: DELF)
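As a minimal illustration of this setup, the sketch below embeds every dataset image with a descriptor model and ranks images by cosine similarity to the query. The `embed` function is a hypothetical stand-in for any image-descriptor network; it is not a specific method from the papers listed here.

```python
import numpy as np

def build_index(dataset_images, embed):
    """Embed every dataset image and L2-normalize the descriptors."""
    feats = np.stack([embed(img) for img in dataset_images])
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)

def retrieve(query_image, index, embed, k=5):
    """Return indices of the k most similar dataset images (cosine similarity)."""
    q = embed(query_image)
    q = q / np.linalg.norm(q)
    scores = index @ q                 # cosine similarity against all dataset images
    return np.argsort(-scores)[:k]     # highest-scoring images first
```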
Secondly, it performs hard negative pair mining among the original and synthetic points to select a more informative negative pair for computing the metric learning loss.
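A minimal sketch of hard negative mining over a pool of original and synthetic embeddings is given below. Synthetic points are illustrated here as midpoints of same-class embedding pairs; the generation scheme and loss in the actual paper may differ.

```python
import numpy as np

def hardest_negative(anchor, anchor_label, embeddings, labels):
    """Mine the hardest negative for `anchor` from original + synthetic points.

    embeddings: (N, D) array of embeddings in a batch
    labels:     (N,) array of class labels (at least one class != anchor_label)
    """
    points, point_labels = list(embeddings), list(labels)
    for lbl in np.unique(labels):
        cls = embeddings[labels == lbl]
        for i in range(len(cls)):
            for j in range(i + 1, len(cls)):
                points.append((cls[i] + cls[j]) / 2.0)   # synthetic same-class point
                point_labels.append(lbl)
    points = np.stack(points)
    point_labels = np.array(point_labels)
    negatives = points[point_labels != anchor_label]
    dists = np.linalg.norm(negatives - anchor, axis=1)
    return negatives[np.argmin(dists)]   # closest negative = most informative pair
```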
The resulting model significantly outperforms state-of-the-art models of similar accuracy in terms of mCE (mean corruption error) and inference throughput.
Text contained in an image carries high-level semantics that can be exploited to achieve richer image understanding.
Computer vision tasks such as image classification, image retrieval and few-shot learning are currently dominated by Euclidean and spherical embeddings, so the final decisions about class membership or the degree of similarity are made using linear hyperplanes, Euclidean distances, or spherical geodesic distances (cosine similarity).
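For concreteness, the two similarity rules mentioned above can be sketched as follows; this is a generic illustration of Euclidean-distance and cosine-similarity ranking, not code from the paper.

```python
import numpy as np

def euclidean_rank(query, gallery):
    """Rank gallery embeddings by Euclidean distance to the query (closest first)."""
    return np.argsort(np.linalg.norm(gallery - query, axis=1))

def cosine_rank(query, gallery):
    """Rank gallery embeddings by cosine similarity to the query (most similar first)."""
    sims = gallery @ query / (np.linalg.norm(gallery, axis=1) * np.linalg.norm(query))
    return np.argsort(-sims)
```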
Much of vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets often studied in isolation; however, the visually-grounded language understanding skills required for success at these tasks overlap significantly.
Visual loop closure detection, which can be cast as an image retrieval task, is an important problem in SLAM (Simultaneous Localization and Mapping) systems.
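A minimal sketch of this retrieval view of loop closure is shown below: the current frame's descriptor is compared against past keyframe descriptors, and a closure is flagged when an older, temporally distant keyframe is similar enough. The descriptor source, frame gap, and threshold are all illustrative assumptions.

```python
import numpy as np

def detect_loop_closure(current_desc, keyframe_descs, frame_gap=50, threshold=0.85):
    """Return the frame id of a matching older keyframe, or None.

    current_desc:   L2-normalized descriptor of the current frame
    keyframe_descs: list of (frame_id, L2-normalized descriptor) for past keyframes
    frame_gap:      skip recent frames so consecutive views don't trigger closures
    threshold:      cosine-similarity threshold (illustrative value)
    """
    if not keyframe_descs:
        return None
    latest = keyframe_descs[-1][0]
    best_id, best_sim = None, -1.0
    for frame_id, desc in keyframe_descs:
        if latest - frame_id < frame_gap:
            continue                              # too recent to count as a loop
        sim = float(np.dot(current_desc, desc))   # cosine similarity
        if sim > best_sim:
            best_id, best_sim = frame_id, sim
    return best_id if best_sim >= threshold else None
```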
We show that using multiple rounds of natural language queries as input can be surprisingly effective for finding arbitrarily specific images of complex scenes.
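One simple way to picture multi-round query retrieval is sketched below: each round's text query is embedded into the same space as the images, folded into a running query state, and the index is re-ranked. The `embed_text` encoder and the running-mean aggregation are assumptions for illustration; the dialog-based method in the paper may combine queries differently.

```python
import numpy as np

def interactive_retrieve(text_queries, image_index, embed_text, k=5):
    """Re-rank an image index after each natural-language query in a dialog.

    image_index: (N, D) array of L2-normalized image embeddings
    embed_text:  hypothetical text encoder mapping a query into the same space
    """
    state = np.zeros(image_index.shape[1])
    rankings = []
    for round_idx, query in enumerate(text_queries, start=1):
        q = embed_text(query)
        state = state + (q - state) / round_idx       # running mean of query embeddings
        scores = image_index @ (state / np.linalg.norm(state))
        rankings.append(np.argsort(-scores)[:k])      # top-k images after this round
    return rankings
```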