Instance Search

9 papers with code • 0 benchmarks • 1 dataset

Visual Instance Search is the task of retrieving, from a database of images, those that contain an instance of a visual query. It is typically much more challenging than finding database images that contain objects of the same category as the query object. If the visual query is an image of a shoe, Visual Instance Search does not try to find images of shoes in general, which might differ from the query in shape, color or size, but tries to find images of the exact same shoe as the one in the query image. Visual Instance Search is demanding on image representations, because the features extracted from the images must enable such fine-grained recognition despite variations in viewpoint, scale, position, illumination, etc. Whereas holistic image representations, where each image is mapped to a single high-dimensional vector, are sufficient for coarse-grained similarity retrieval, local features are needed for instance retrieval.

Source: Dynamicity and Durability in Scalable Visual Instance Search
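
The contrast between holistic and local representations described above can be illustrated with a toy sketch. It is not taken from the cited paper: the mean-pooled holistic vector, the cosine threshold `min_sim`, the function names and the hand-built 4-dimensional descriptors are all assumptions chosen for illustration. Real systems would use learned or hand-crafted descriptors and usually add geometric verification.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def global_descriptor(local_feats):
    """Collapse an image's local descriptors into one holistic vector.

    Mean pooling is an assumption made for this sketch.
    """
    return l2_normalize(local_feats.mean(axis=0))


def global_score(query_feats, image_feats):
    """Cosine similarity between the two holistic vectors."""
    return float(global_descriptor(query_feats) @ global_descriptor(image_feats))


def local_match_score(query_feats, image_feats, min_sim=0.9):
    """Count query descriptors whose best match in the image reaches `min_sim`.

    The threshold value is hypothetical, not a recommended setting.
    """
    sims = l2_normalize(query_feats) @ l2_normalize(image_feats).T
    return int((sims.max(axis=1) >= min_sim).sum())


# Hand-built toy descriptors (rows are local features of one image).
e = np.eye(4)
query = np.stack([e[0], e[1]])  # the two distinctive parts of the queried object
database = {
    # Same object, surrounded by background clutter (two extra features).
    "same_instance_with_clutter": np.stack([e[0], e[1], e[2], e[3]]),
    # Different object of the same category: one feature that resembles
    # the query only on average.
    "same_category_other_instance": l2_normalize((e[0] + e[1])[None, :]),
}


def rank(score_fn):
    """Return database image names ordered from best to worst score."""
    return sorted(database, key=lambda name: score_fn(query, database[name]),
                  reverse=True)


global_ranking = rank(global_score)
local_ranking = rank(local_match_score)
print("holistic ranking:", global_ranking)
print("local ranking:   ", local_ranking)
```

In this constructed example the holistic vector ranks the same-category image first, because clutter dilutes the pooled vector of the true match, while the local matching score ranks the image containing the exact instance first. The data are deliberately adversarial, so the sketch shows how the failure can arise rather than how often it does.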

Latest papers with no code

CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval

no code yet • 15 Apr 2023

To address this challenge, we experimentally observe that vision-language divergence may give rise to strong and weak modalities, and that hard cross-modal consistency cannot guarantee that the relationships among strong-modality instances are unaffected by the weak modality; as a result, these relationships are perturbed even when the learned representations are consistent. To this end, we propose a novel Coordinated Vision-Language Retrieval method (dubbed CoVLR), which aims to study and alleviate the desynchrony problem between the cross-modal alignment and single-modal cluster-preserving tasks.

WHU-NERCMS at TRECVID2021: Instance Search Task

no code yet • 30 Oct 2021

This paper briefly introduces the experimental methods and results of WHU-NERCMS at TRECVID2021.

OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning

no code yet • 8 Aug 2021

We introduce the task of open-vocabulary visual instance search (OVIS).

Towards Accurate Localization by Instance Search

no code yet • 11 Jul 2021

In this paper, a self-paced learning framework is proposed to achieve accurate object localization on the rank list returned by instance search.

TRECVID 2020: A comprehensive campaign for evaluating video retrieval tasks across multiple application domains

no code yet • 27 Apr 2021

In total, 29 teams from various research organizations worldwide completed one or more of six tasks.

Deep Learning for Instance Retrieval: A Survey

no code yet • 27 Jan 2021

In recent years a vast amount of visual content has been generated and shared from many fields, such as social media platforms, medical imaging, and robotics.

TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval

no code yet • 21 Sep 2020

The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation.

Deeply Activated Salient Region for Instance Search

no code yet • 1 Feb 2020

Due to the lack of a proper mechanism for locating instances and deriving feature representations, instance search is generally only effective for retrieving instances of known object categories.

Compressive Quantization for Fast Object Instance Search in Videos

no code yet • ICCV 2017

Most current visual search systems focus on image-to-image (point-to-point) search, such as image and object retrieval.

Learning Non-Metric Visual Similarity for Image Retrieval

no code yet • ICLR 2018

Theoretically, non-metric distances are able to generate a more complex and accurate similarity model than metric distances, provided that the non-linear data distribution is precisely captured by the system.