Image Retrieval
668 papers with code • 54 benchmarks • 75 datasets
Image Retrieval is a fundamental, long-standing computer vision task: given a query, find similar images in a large database. It is often treated as a form of fine-grained, instance-level classification. Beyond being a core part of image recognition alongside classification and detection, it has substantial business value, helping users discover images that match their interests or requirements based on visual similarity or other criteria.
(Image credit: DELF)
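As a rough illustration of the task described above, retrieval is commonly framed as a nearest-neighbour search over image embeddings. The sketch below assumes each image has already been mapped to a feature vector by some encoder (not shown) and ranks database vectors by cosine similarity to the query. The function names `cosine_similarity` and `retrieve` and the toy vectors are hypothetical, chosen only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=3):
    """Return (index, score) pairs for the k database vectors most similar to query."""
    scored = [(cosine_similarity(query, vec), idx) for idx, vec in enumerate(database)]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy 2-D "embeddings" standing in for real image features (assumed data).
database = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
query = [0.9, 0.1]
results = retrieve(query, database, k=2)
```

An exhaustive scan like this is only practical for small collections; large databases typically rely on approximate nearest-neighbour indexes instead.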
Latest papers with no code
Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval
However, we conjecture that this approach has a downside: the projection module distorts the original image representation and confines the resulting composed embeddings to the text-side.
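For readers unfamiliar with the term in the title, spherical linear interpolation (slerp) blends two unit vectors along the arc between them rather than along a straight line. The sketch below shows the generic formula applied to two embedding vectors. It is not taken from the paper, and the names `normalize` and `slerp` and the interpolation weight `t` are assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def slerp(a, b, t):
    """Interpolate on the unit sphere from a (t=0) to b (t=1)."""
    a, b = normalize(a), normalize(b)
    # Clamp to guard against rounding pushing the dot product outside [-1, 1].
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    theta = math.acos(dot)
    if theta < 1e-8:
        return a  # vectors are (nearly) identical
    if math.pi - theta < 1e-8:
        raise ValueError("slerp is undefined for opposite vectors")
    sin_theta = math.sin(theta)
    weight_a = math.sin((1.0 - t) * theta) / sin_theta
    weight_b = math.sin(t * theta) / sin_theta
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy example: halfway between two orthogonal unit vectors.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike plain linear interpolation, the result stays on the unit sphere, which matters when embeddings are compared by cosine similarity.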
Large Language Model Informed Patent Image Retrieval
In patent prosecution, image-based retrieval systems for identifying similarities between current patent images and prior art are pivotal to ensure the novelty and non-obviousness of patent applications.
Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models
Our contributions encompass the development of an innovative interactive image retrieval system, the integration of an LLM-based denoiser, the curation of a meticulously designed evaluation dataset, and thorough experimental validation.
Dual-Modal Prompting for Sketch-Based Image Retrieval
In this study, we aim to tackle two major challenges of this task simultaneously: i) zero-shot, dealing with unseen categories, and ii) fine-grained, referring to intra-category instance-level retrieval.
Learning text-to-video retrieval from image captioning
In this paper, we make use of this progress and instantiate the image experts from two types of models: a text-to-image retrieval model to provide an initial backbone, and image captioning models to provide supervision signal into unlabeled videos.
Revisiting Relevance Feedback for CLIP-based Interactive Image Retrieval
However, metric learning cannot handle differences in users' preferences, and requires data to train an image encoder.
CriSp: Leveraging Tread Depth Maps for Enhanced Crime-Scene Shoeprint Matching
Moreover, all existing methods match crime-scene shoeprints to clean reference prints, yet our analysis shows matching to more informative tread depth maps yields better retrieval results.
DVF: Advancing Robust and Accurate Fine-Grained Image Retrieval with Retrieval Guidelines
This paper presents a meticulous analysis leading to the proposal of practical guidelines to identify subcategory-specific discrepancies and generate discriminative features to design effective FGIR models.
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval
Composed Image Retrieval (CIR) is a task that retrieves images similar to a query, based on a provided textual modification.
Collaborative Visual Place Recognition through Federated Learning
Visual Place Recognition (VPR) aims to estimate the location of an image by treating it as a retrieval problem.