Texts as Images in Prompt Tuning for Multi-Label Image Recognition

CVPR 2023 · Zixian Guo, Bowen Dong, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo

Prompt tuning has been employed as an efficient way to adapt large vision-language pre-trained models (e.g., CLIP) to various downstream tasks in data-limited or label-limited settings. Nonetheless, visual data (e.g., images) is by default a prerequisite for learning prompts in existing methods. In this work, we argue that the effectiveness of image-text contrastive learning in aligning the two modalities (for training CLIP) makes it feasible to treat texts as images for prompt tuning, and we introduce TaI prompting. In contrast to visual data, text descriptions are easy to collect, and their class labels can be directly derived. In particular, we apply TaI prompting to multi-label image recognition, where sentences in the wild serve as alternatives to images for prompt tuning. Moreover, with TaI, double-grained prompt tuning (TaI-DPT) is presented to extract both coarse-grained and fine-grained embeddings for enhancing multi-label recognition performance. Experimental results show that our proposed TaI-DPT outperforms zero-shot CLIP by a large margin on multiple benchmarks, e.g., MS-COCO, VOC2007, and NUS-WIDE, and that it can be combined with existing methods of prompting from images to further improve recognition performance. Code is released at https://github.com/guozix/TaI-DPT.
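The statement that class labels can be directly derived from text descriptions can be illustrated with a minimal sketch. It assumes plain word matching against a small hand-written synonym table; the table, the function name `labels_from_text`, and the example captions are all hypothetical and are not taken from the paper or the released code.

```python
import re

# Hypothetical synonym table: class name -> words that count as a mention.
# These entries are illustrative only, not taken from the paper or its code.
CLASS_SYNONYMS = {
    "dog": {"dog", "dogs", "puppy", "puppies"},
    "person": {"person", "people", "man", "woman", "boy", "girl"},
    "bicycle": {"bicycle", "bicycles", "bike", "bikes"},
}
CLASS_NAMES = sorted(CLASS_SYNONYMS)  # fixed class order for the label vector


def labels_from_text(sentence: str) -> list[int]:
    """Return a multi-hot vector: 1 if any synonym of the class appears."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return [int(bool(words & CLASS_SYNONYMS[name])) for name in CLASS_NAMES]


# Made-up example sentences standing in for text collected in the wild.
captions = [
    "A man rides a bike while his puppy runs alongside.",
    "Two dogs sleeping on a couch.",
    "An empty street at night.",
]

# Keep only sentences that mention at least one target class.
training_pairs = [
    (text, labels_from_text(text))
    for text in captions
    if any(labels_from_text(text))
]
```

Under TaI prompting as described in the abstract, such (sentence, label vector) pairs would stand in for (image, label vector) pairs when learning prompts. The matching rule used here is only one simple possibility.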

Code

guozix/tai-dpt official

Tasks

Contrastive Learning

Multi-label Image Recognition with Partial Labels

Datasets

MS COCO

NUS-WIDE

PASCAL VOC 2007

Results from the Paper

Ranked #1 on Multi-label Image Recognition with Partial Labels on PASCAL VOC 2007

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Multi-label Image Recognition with Partial Labels	MS-COCO-2014	DualCoOp+TaI-DPT	Average mAP	83.6	# 1
Multi-label Image Recognition with Partial Labels	PASCAL VOC 2007	DualCoOp+TaI-DPT	Average mAP	94.8	# 1

Methods

CLIP • Contrastive Learning
