TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Retrieval with Multi-Modal Query	Fashion200k	FashionConcept	Recall@1	6.3	# 8
Image Retrieval with Multi-Modal Query	Fashion200k	FashionConcept	Recall@10	19.9	# 8
Image Retrieval with Multi-Modal Query	Fashion200k	FashionConcept	Recall@50	38.3	# 8

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/automatic-spatially-aware-fashion-concept/image-retrieval-with-multi-modal-query-on)](https://paperswithcode.com/sota/image-retrieval-with-multi-modal-query-on?p=automatic-spatially-aware-fashion-concept)`

Automatic Spatially-aware Fashion Concept Discovery

ICCV 2017 · Xintong Han, Zuxuan Wu, Phoenix X. Huang, Xiao Zhang, Menglong Zhu, Yuan Li, Yang Zhao, Larry S. Davis ·

This paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites. We first fine-tune GoogleNet by jointly modeling clothing images and their corresponding descriptions in a visual-semantic embedding space. Then, for each attribute (word), we generate its spatially-aware representation by combining its semantic word vector representation with its spatial representation derived from the convolutional maps of the fine-tuned network. The resulting spatially-aware representations are further used to cluster attributes into multiple groups to form spatially-aware concepts (e.g., the neckline concept might consist of attributes like v-neck, round-neck, etc). Finally, we decompose the visual-semantic embedding space into multiple concept-specific subspaces, which facilitates structured browsing and attribute-feedback product retrieval by exploiting multimodal linguistic regularities. We conducted extensive experiments on our newly collected Fashion200K dataset, and results on clustering quality evaluation and attribute-feedback product retrieval task demonstrate the effectiveness of our automatically discovered spatially-aware concepts.