Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition

This work proposes POMP, a prompt pre-training method for vision-language models. Being memory- and computation-efficient, POMP enables the learned prompt to condense semantic information for a rich set of visual concepts spanning over twenty thousand classes. Once pre-trained, the prompt is strongly transferable and can be directly plugged into a variety of visual recognition tasks, including image classification, semantic segmentation, and object detection, to boost recognition performance in a zero-shot manner. Empirical evaluation shows that POMP achieves state-of-the-art performance on 21 datasets, e.g., 67.0% average accuracy on 10 classification datasets (+3.1% over CoOp) and 84.4 hIoU on open-vocabulary Pascal VOC segmentation (+6.9 over ZSSeg). Our code is available at https://github.com/amazon-science/prompt-pretraining.

Published at NeurIPS 2023.
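
Once the prompt is pre-trained, using it for zero-shot recognition amounts to injecting the learned context vectors into CLIP's text encoder in place of a hand-written template, in the style of CoOp. The sketch below illustrates this for image classification only; the checkpoint name `pomp_ctx.pt`, the context shape, and the class list are illustrative assumptions, not the official POMP interface (see the repository above for that).

```python
# Minimal sketch: plugging a pre-trained soft prompt into CLIP for zero-shot
# classification via CoOp-style prompt injection. File names and shapes below
# are assumptions for illustration.
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/16", device=device)

# Assumed: learned context vectors of shape (n_ctx, 512), where 512 is the
# text transformer width of ViT-B/16, saved after prompt pre-training.
ctx = torch.load("pomp_ctx.pt", map_location=device)          # (n_ctx, 512)
n_ctx = ctx.shape[0]

classnames = ["golden retriever", "sports car", "espresso"]   # any open vocabulary

@torch.no_grad()
def build_text_classifier(classnames, ctx):
    # Tokenize prompts with n_ctx placeholder tokens followed by the class name.
    prompts = [" ".join(["X"] * n_ctx) + " " + c + "." for c in classnames]
    tokens = clip.tokenize(prompts).to(device)                 # (C, 77)
    emb = model.token_embedding(tokens).type(model.dtype)      # (C, 77, 512)
    # Replace the placeholder embeddings (positions 1..n_ctx, after SOS)
    # with the pre-trained context vectors.
    emb[:, 1 : 1 + n_ctx, :] = ctx.type(model.dtype)
    # Run CLIP's text transformer manually, as in CoOp.
    x = emb + model.positional_embedding.type(model.dtype)
    x = model.transformer(x.permute(1, 0, 2)).permute(1, 0, 2)
    x = model.ln_final(x).type(model.dtype)
    # Take the feature at the EOT token (highest token id) and project it.
    feats = x[torch.arange(x.shape[0]), tokens.argmax(dim=-1)] @ model.text_projection
    return feats / feats.norm(dim=-1, keepdim=True)            # (C, 512)

text_features = build_text_classifier(classnames, ctx)

@torch.no_grad()
def classify(image):
    # image: a PIL.Image; returns class probabilities over `classnames`.
    img = preprocess(image).unsqueeze(0).to(device)
    img_features = model.encode_image(img)
    img_features = img_features / img_features.norm(dim=-1, keepdim=True)
    logits = model.logit_scale.exp() * img_features @ text_features.t()
    return logits.softmax(dim=-1)
```

For segmentation or detection, the same prompt-conditioned text features would serve as per-class classifiers inside the task head; that part is omitted here.
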
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Open Vocabulary Semantic Segmentation | COCO-Stuff-171 | POMP | hIoU | 39.1 | #1 |
| Prompt Engineering | ImageNet-21k | POMP | Accuracy | 25.3 | #1 |
| Prompt Engineering | ImageNet-A | POMP | Top-1 accuracy (%) | 51.6 | #1 |
| Prompt Engineering | ImageNet-R | POMP | Top-1 accuracy (%) | 77.9 | #1 |
| Prompt Engineering | ImageNet-S | POMP | Top-1 accuracy (%) | 49.8 | #1 |
| Open Vocabulary Object Detection | LVIS v1.0 | POMP | AP novel (LVIS base training) | 25.2 | #10 |
| Open Vocabulary Semantic Segmentation | PascalVOC-20 | POMP | mIoU | 89.4 | #9 |
| Open Vocabulary Semantic Segmentation | PascalVOC-20 | POMP | hIoU | 84.4 | #1 |
