Image Retrieval
665 papers with code • 54 benchmarks • 75 datasets
Image Retrieval is a fundamental, long-standing computer vision task: given a query, find similar images in a large database. It is often treated as a form of fine-grained, instance-level classification. Alongside classification and detection, it is an integral part of image recognition, and it also has substantial business value because it helps users discover images that match their interests or requirements, based on visual similarity or other criteria.
(Image credit: DELF)
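The core operation described above, ranking database images by similarity to a query, can be sketched as a nearest-neighbour search over descriptor vectors. This is a minimal illustration under stated assumptions, not any particular paper's method: the function names `l2_normalize` and `retrieve`, the use of cosine similarity, and the toy 3-dimensional vectors are all hypothetical stand-ins for real image embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve(query, database, top_k=3):
    """Return (index, cosine similarity) pairs for the top_k closest database vectors."""
    q = l2_normalize(query)
    scored = []
    for idx, vec in enumerate(database):
        d = l2_normalize(vec)
        # The dot product of unit vectors is their cosine similarity.
        scored.append((idx, sum(a * b for a, b in zip(q, d))))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy descriptors (assumed values) standing in for embeddings of four images.
database = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.02, 0.0]
results = retrieve(query, database, top_k=2)
print(results)
```

A brute-force scan like this is only practical for small databases; large-scale systems typically swap in an approximate nearest-neighbour index while keeping the same ranking idea.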
Most implemented papers
Learning Deep Representations of Fine-grained Visual Descriptions
State-of-the-art methods for zero-shot visual recognition formulate learning as a joint embedding problem of images and side information.
Looking at Outfit to Parse Clothing
This paper extends fully-convolutional neural networks (FCN) for the clothing parsing problem.
Combination of Multiple Global Descriptors for Image Retrieval
Recent studies on the image retrieval task have shown that ensembling different models and combining multiple global descriptors lead to performance improvement.
Particular object retrieval with integral max-pooling of CNN activations
Recently, image representation built upon Convolutional Neural Network (CNN) has been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations.
Stacked Cross Attention for Image-Text Matching
Prior work either simply aggregates the similarity of all possible pairs of regions and words without attending differentially to more and less important words or regions, or uses a multi-step attentional process to capture a limited number of semantic alignments, which is less interpretable.
CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples
Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in many computer vision tasks.
Sampling Matters in Deep Embedding Learning
In addition, we show that a simple margin based loss is sufficient to outperform all other loss functions.
Batch DropBlock Network for Person Re-identification and Beyond
In this paper, we propose the Batch DropBlock (BDB) Network which is a two branch network composed of a conventional ResNet-50 as the global branch and a feature dropping branch.
SoftTriple Loss: Deep Metric Learning Without Triplet Sampling
The set of triplet constraints has to be sampled within the mini-batch.
12-in-1: Multi-Task Vision and Language Representation Learning
Much of vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets often studied in isolation; however, the visually-grounded language understanding skills required for success at these tasks overlap significantly.
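Several entries above mention global descriptors built by pooling CNN activations, including max-pooling and the combination of multiple descriptors. The sketch below illustrates that general idea only and is not the implementation from any listed paper: it pools a toy feature map with average pooling, max pooling, and generalized-mean pooling, then concatenates the L2-normalized results. The function names, the exponent `gem_p=3.0`, and the toy activations are assumptions made for illustration.

```python
import math


def gem_pool(feature_map, p):
    """Generalized-mean pool each channel of non-negative activations.

    p = 1 is average pooling; larger p moves the result toward max pooling.
    """
    return [
        (sum(v ** p for v in channel) / len(channel)) ** (1.0 / p)
        for channel in feature_map
    ]


def max_pool(feature_map):
    """Keep the strongest activation in each channel."""
    return [max(channel) for channel in feature_map]


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combined_descriptor(feature_map, gem_p=3.0):
    """Concatenate three normalized pooled descriptors and normalize the result."""
    parts = [
        l2_normalize(gem_pool(feature_map, 1.0)),    # average pooling
        l2_normalize(max_pool(feature_map)),         # max pooling
        l2_normalize(gem_pool(feature_map, gem_p)),  # generalized-mean pooling
    ]
    concatenated = [x for part in parts for x in part]
    return l2_normalize(concatenated)


# Toy feature map (assumed values): 2 channels, each with 4 flattened spatial positions.
feature_map = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 0.0, 0.0, 0.0],
]
descriptor = combined_descriptor(feature_map)
print(descriptor)
```

Normalizing each part before concatenation keeps any single pooling type from dominating the final descriptor purely through its scale.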