Compositional Learning of Image-Text Query for Image Retrieval

19 Jun 2020 · Muhammad Umer Anwaar, Egor Labintcev, Martin Kleinsteuber

In this paper, we investigate the problem of retrieving images from a database based on a multi-modal (image-text) query. Specifically, the query text prompts some modification in the query image, and the task is to retrieve images with the desired modifications. For instance, a user of an e-commerce platform is interested in buying a dress that looks similar to her friend's dress, but in white and with a ribbon sash. In this case, we would like the algorithm to retrieve dresses that apply the desired modifications to the query dress. We propose an autoencoder-based model, ComposeAE, to learn the composition of the image and text query for retrieving images. We adopt a deep metric learning approach and learn a metric that pushes the composition of the source image and text query closer to the target images. We also propose a rotational symmetry constraint on the optimization problem. Our approach outperforms the state-of-the-art method TIRG on three benchmark datasets: MIT-States, Fashion200k and Fashion IQ. To ensure a fair comparison, we introduce strong baselines by enhancing the TIRG method. To ensure reproducibility of the results, we publish our code here: https://github.com/ecom-research/ComposeAE.
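
To make the general idea concrete, below is a minimal sketch (not the authors' implementation) of composing a query-image embedding with a modification-text embedding and training with a batch-wise metric-learning loss, so that the composed query is pulled toward the target-image embedding. The class and function names (ComposerSketch, batch_metric_loss), the feature dimensions, and the simple MLP fusion are illustrative assumptions; the paper's actual architecture, its autoencoder decoders, and the rotational symmetry constraint are omitted here.

```python
# Minimal sketch (assumptions, not the ComposeAE code): fuse image and text
# features into one query embedding and train it to land near the target image.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComposerSketch(nn.Module):
    def __init__(self, img_dim=512, txt_dim=768, emb_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, emb_dim)   # project image features
        self.txt_proj = nn.Linear(txt_dim, emb_dim)   # project text features
        # simple MLP fusion of the two modalities (illustrative choice)
        self.compose = nn.Sequential(
            nn.Linear(2 * emb_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim)
        )

    def forward(self, img_feat, txt_feat):
        z_img = self.img_proj(img_feat)
        z_txt = self.txt_proj(txt_feat)
        return self.compose(torch.cat([z_img, z_txt], dim=-1))

def batch_metric_loss(composed, target, temperature=0.1):
    """Softmax cross-entropy over the batch: the i-th composed query should be
    most similar to the i-th target embedding; other targets act as negatives."""
    q = F.normalize(composed, dim=-1)
    t = F.normalize(target, dim=-1)
    logits = q @ t.t() / temperature
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors standing in for real image/text encoder outputs.
model = ComposerSketch()
img_feat = torch.randn(8, 512)      # query-image features
txt_feat = torch.randn(8, 768)      # modification-text features
target_feat = torch.randn(8, 512)   # target-image features
loss = batch_metric_loss(model(img_feat, txt_feat), model.img_proj(target_feat))
loss.backward()
```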

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Image Retrieval with Multi-Modal Query | Fashion200k | ComposeAE | Recall@1 | 22.8 | #2 |
| Image Retrieval with Multi-Modal Query | Fashion200k | ComposeAE | Recall@10 | 55.3 | #1 |
| Image Retrieval with Multi-Modal Query | Fashion200k | ComposeAE | Recall@50 | 73.4 | #1 |
| Image Retrieval with Multi-Modal Query | FashionIQ | ComposeAE | Recall@10 | 11.8 | #1 |
| Image Retrieval | Fashion IQ | ComposeAE | (Recall@10+Recall@50)/2 | 20.6 | #18 |
| Image Retrieval with Multi-Modal Query | MIT-States | ComposeAE | Recall@1 | 13.9 | #1 |
| Image Retrieval with Multi-Modal Query | MIT-States | ComposeAE | Recall@5 | 35.5 | #1 |
| Image Retrieval with Multi-Modal Query | MIT-States | ComposeAE | Recall@10 | 47.9 | #1 |
