Image Retrieval with Multi-Modal Query

9 papers with code • 3 benchmarks • 2 datasets

The task of retrieving images from a database using a multi-modal (image-text) query. Specifically, the query text describes a modification to the query image, and the goal is to retrieve images that show the desired modification.
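As a rough illustration of the task setup described above, the following minimal sketch ranks database images against a composed image-plus-text query. Everything in it is an assumption made for illustration: the three-dimensional toy features, the function names `compose_query` and `retrieve`, and the additive composition followed by cosine similarity. The papers listed on this page learn far richer composition functions; this only shows the shape of the problem.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def compose_query(image_feat, text_feat):
    """Hypothetical composition: element-wise sum of the two features, then normalize."""
    return l2_normalize([a + b for a, b in zip(image_feat, text_feat)])


def retrieve(image_feat, text_feat, database, top_k=3):
    """Rank database entries by cosine similarity to the composed query."""
    query = compose_query(image_feat, text_feat)
    scored = []
    for name, feat in database.items():
        target = l2_normalize(feat)
        score = sum(q * t for q, t in zip(query, target))
        scored.append((score, name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy features (assumed for illustration): [is_dress, is_red, is_blue]
database = {
    "red_dress": [1.0, 1.0, 0.0],
    "blue_dress": [1.0, 0.0, 1.0],
    "blue_shirt": [0.0, 0.0, 1.0],
}
query_image = [1.0, 1.0, 0.0]   # the query image: a red dress
query_text = [0.0, -1.0, 1.0]   # the text modification: remove red, add blue

ranking = retrieve(query_image, query_text, database)
print(ranking)
```

In this toy setup the blue dress ranks first, because it keeps the content of the query image while applying the change requested in the text.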

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Monoxide-Chen/uncertainty_retrieval 14 Nov 2022

The key idea underpinning the proposed method is to integrate fine- and coarse-grained retrieval as matching data points with small and large fluctuations, respectively.

50

Compositional Learning of Image-Text Query for Image Retrieval

ecom-research/ComposeAE 19 Jun 2020

In this paper, we investigate the problem of retrieving images from a database based on a multi-modal (image-text) query.

54

Composing Text and Image for Image Retrieval - An Empirical Odyssey

google/tirg CVPR 2019

In this paper, we study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.

295
18 Dec 2018

Attributes as Operators: Factorizing Unseen Attribute-Object Compositions

Tushar-N/attributes-as-operators ECCV 2018

In addition, we show that not only can our model recognize unseen compositions robustly in an open-world setting, it can also generalize to compositions where objects themselves were unseen during training.

64
27 Mar 2018

FiLM: Visual Reasoning with a General Conditioning Layer

ethanjperez/film 22 Sep 2017

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation.

296
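Feature-wise Linear Modulation, as named in the FiLM entry above, scales and shifts each feature channel using a per-channel coefficient pair (gamma, beta) computed from a conditioning input such as an encoded question. The sketch below illustrates that operation in plain Python. The helper names `linear` and `film`, the tiny weight matrices, and the use of a single linear map to produce gamma and beta are assumptions for illustration, not the code from ethanjperez/film.

```python
def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Modulate each feature channel as gamma_c * value + beta_c.

    features: list of channels, each a list of activations
    cond:     conditioning vector (for example, an encoded question)
    """
    gamma = linear(w_gamma, b_gamma, cond)  # one scale per channel
    beta = linear(w_beta, b_beta, cond)     # one shift per channel
    return [
        [g * v + b for v in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Two channels with two activations each (toy values, assumed for illustration)
features = [[1.0, 2.0], [3.0, 4.0]]
cond = [1.0, 0.0]

# Hypothetical parameters of the maps that predict gamma and beta from cond
w_gamma, b_gamma = [[2.0, 0.0], [0.0, 1.0]], [0.0, 1.0]   # gives gamma = [2.0, 1.0]
w_beta, b_beta = [[0.5, 0.0], [-1.0, 0.0]], [0.0, 0.0]    # gives beta = [0.5, -1.0]

modulated = film(features, cond, w_gamma, b_gamma, w_beta, b_beta)
print(modulated)
```

The same scale and shift is applied to every activation within a channel, so the conditioning input changes which channels are emphasized or suppressed without altering the spatial layout of the features.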

Automatic Spatially-aware Fashion Concept Discovery

naver/artemis ICCV 2017

This paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites.

45
03 Aug 2017

A simple neural network module for relational reasoning

kimhc6028/relational-networks NeurIPS 2017

Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn.

806
05 Jun 2017

Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction

HyeonwooNoh/DPPnet CVPR 2016

We tackle the image question answering (ImageQA) problem by learning a convolutional neural network (CNN) with a dynamic parameter layer whose weights are determined adaptively based on questions.

94
18 Nov 2015

Show and Tell: A Neural Image Caption Generator

karpathy/neuraltalk CVPR 2015

Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions.

5,378
17 Nov 2014