Image Retrieval with Multi-Modal Query

9 papers with code • 3 benchmarks • 2 datasets

The task of retrieving images from a database given a multi-modal (image-text) query. Specifically, the query text describes a modification to the query image, and the goal is to retrieve images that show the desired modification.
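
As a rough illustration of the task setup, the following minimal sketch composes a query image feature with a text feature and ranks database images by cosine similarity. The additive `compose` function, the toy 2-D features, and all names here are assumptions made for illustration; they are not the method of any paper listed on this page.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compose(image_feat, text_feat):
    """Hypothetical composition: add the text 'modification' vector to the image feature."""
    return [a + b for a, b in zip(image_feat, text_feat)]


def retrieve(image_feat, text_feat, database, k=1):
    """Return the k database keys whose features are most similar to the composed query."""
    query = compose(image_feat, text_feat)
    ranked = sorted(database, key=lambda name: cosine(query, database[name]), reverse=True)
    return ranked[:k]


# Toy features (assumed): axis 0 stands for "dress", axis 1 for "red".
database = {
    "blue_dress": [1.0, 0.0],
    "red_dress": [1.0, 1.0],
    "red_shoe": [0.0, 1.0],
}
query_image = [1.0, 0.0]  # a blue dress
query_text = [0.0, 1.0]   # text meaning "make it red"
top = retrieve(query_image, query_text, database, k=1)
```

In practice the features would come from learned image and text encoders, and the composition function is where the methods listed below differ.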

Benchmarks

These leaderboards are used to track progress in Image Retrieval with Multi-Modal Query.

Dataset	Best Model
Fashion200k	Css-Net
MIT-States	ComposeAE
FashionIQ	ComposeAE

Most implemented papers

Show and Tell: A Neural Image Caption Generator

karpathy/neuraltalk • CVPR 2015

Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions.

A simple neural network module for relational reasoning

kimhc6028/relational-networks • NeurIPS 2017

Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn.

FiLM: Visual Reasoning with a General Conditioning Layer

ethanjperez/film • 22 Sep 2017

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation.
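
Feature-wise linear modulation applies a per-channel scale and shift, `gamma * x + beta`, where `gamma` and `beta` are predicted from a conditioning input such as a text embedding. The sketch below shows that operation in plain Python; the `linear` helper, the weight values, and the tiny feature map are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def linear(weights, bias, x):
    """Dense layer: one output per weight row, computed as row . x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def film(features, gamma, beta):
    """Scale and shift every value in channel c by gamma[c] and beta[c]."""
    return [[g * v + b for v in channel] for channel, g, b in zip(features, gamma, beta)]


# Two feature channels with two values each (assumed toy feature map).
features = [[1.0, 2.0], [3.0, 4.0]]
# Conditioning vector, e.g. a text embedding (assumed).
condition = [1.0, 0.0]

# Hypothetical weights of the layers that predict gamma and beta from the condition.
gamma = linear([[2.0, 0.0], [0.0, 1.0]], [0.0, 0.0], condition)
beta = linear([[0.0, 0.0], [1.0, 0.0]], [0.5, 0.0], condition)

modulated = film(features, gamma, beta)
```

Here the conditioning input amplifies the first channel and zeroes out the second before adding the shift, which is how a text input can emphasize or suppress individual visual features.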

Composing Text and Image for Image Retrieval - An Empirical Odyssey

google/tirg • CVPR 2019

In this paper, we study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.

Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction

HyeonwooNoh/DPPnet • CVPR 2016

We tackle the image question answering (ImageQA) problem by learning a convolutional neural network (CNN) with a dynamic parameter layer whose weights are determined adaptively based on questions.

Automatic Spatially-aware Fashion Concept Discovery

naver/artemis • ICCV 2017

This paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites.

Attributes as Operators: Factorizing Unseen Attribute-Object Compositions

Tushar-N/attributes-as-operators • ECCV 2018

We show that not only can our model recognize unseen compositions robustly in an open-world setting, it can also generalize to compositions where objects themselves were unseen during training.

Compositional Learning of Image-Text Query for Image Retrieval

ecom-research/ComposeAE • 19 Jun 2020

In this paper, we investigate the problem of retrieving images from a database based on a multi-modal (image-text) query.

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Monoxide-Chen/uncertainty_retrieval • 14 Nov 2022

The key idea underpinning the proposed method is to integrate fine- and coarse-grained retrieval as matching data points with small and large fluctuations, respectively.

