TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Weakly-Supervised Object Localization	CUB-200-2011	GenPromp	Top-1 Localization Accuracy	87.0	# 1
Weakly-Supervised Object Localization	CUB-200-2011	Stable diffusion	Top-1 Localization Accuracy	87.0	# 2
Weakly-Supervised Object Localization	CUB-200-2011	Stable diffusion	GT-known localization accuracy	98.0	# 2
Weakly-Supervised Object Localization	ImageNet	Stable diffusion	GT-known localization accuracy	75.0	# 1
Weakly-Supervised Object Localization	ImageNet	Stable diffusion	Top-1 Localization Accuracy	65.2	# 1

Generative Prompt Model for Weakly Supervised Object Localization

ICCV 2023 · Yuzhong Zhao, Qixiang Ye, Weijia Wu, Chunhua Shen, Fang Wan

Weakly supervised object localization (WSOL) remains challenging when learning object localization models from image category labels. Conventional methods that discriminatively train activation models ignore representative yet less discriminative object parts. In this study, we propose a generative prompt model (GenPromp), defining the first generative pipeline to localize less discriminative object parts by formulating WSOL as a conditional image denoising procedure. During training, GenPromp converts image category labels to learnable prompt embeddings which are fed to a generative model to conditionally recover the input image with noise and learn representative embeddings. During inference, GenPromp combines the representative embeddings with discriminative embeddings (queried from an off-the-shelf vision-language model) for both representative and discriminative capacity. The combined embeddings are finally used to generate multi-scale high-quality attention maps, which facilitate localizing full object extent. Experiments on CUB-200-2011 and ILSVRC show that GenPromp respectively outperforms the best discriminative models by 5.2% and 5.6% (Top-1 Loc), setting a solid baseline for WSOL with the generative model. Code is available at https://github.com/callsys/GenPromp.
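
The inference step described in the abstract (combine two embeddings, then localize from an attention map) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear blend with weight `w`, the min-max normalization, the fixed threshold, and the function names `combine_embeddings` and `attention_to_box` are all assumptions made for illustration, and the attention map is taken as given rather than produced by a diffusion model.

```python
def combine_embeddings(representative, discriminative, w=0.5):
    """Blend two equal-length embedding vectors.

    Assumption: a simple linear blend; the abstract only says the two
    embeddings are "combined" and does not specify how.
    """
    if len(representative) != len(discriminative):
        raise ValueError("embeddings must have the same length")
    return [w * r + (1.0 - w) * d
            for r, d in zip(representative, discriminative)]


def attention_to_box(attn, threshold=0.5):
    """Return the tightest box (x_min, y_min, x_max, y_max) around all cells
    whose min-max normalized attention is >= threshold, or None if no cell
    qualifies.

    Assumption: thresholding a normalized map is a common WSOL
    post-processing step, used here only as a stand-in.
    """
    flat = [v for row in attn for v in row]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) or 1.0  # avoid division by zero on a constant map
    xs, ys = [], []
    for y, row in enumerate(attn):
        for x, v in enumerate(row):
            if (v - lo) / scale >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 4x4 attention map with a bright 2x2 region in the middle.
example_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 1.0, 0.0],
    [0.0, 0.8, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
example_box = attention_to_box(example_map)
example_embedding = combine_embeddings([1.0, 0.0], [0.0, 1.0], w=0.5)
```

In this toy setup, raising `w` weights the representative embedding more heavily, and lowering `threshold` grows the box toward less strongly attended regions.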

Code

callsys/genpromp official

Tasks

Denoising

Image Denoising

Language Modelling

Object

Object Localization

Weakly-Supervised Object Localization

Datasets

ImageNet

CUB-200-2011

ImageNet-1K

LAION-5B

Results from the Paper

Ranked #1 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric, using extra training data)
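
For readers unfamiliar with the two metrics reported for this paper, the sketch below shows how they are conventionally computed in WSOL evaluation. It assumes the usual convention, which this page does not state: a prediction counts for GT-known localization accuracy when its box overlaps the ground-truth box with IoU of at least 0.5, and for Top-1 localization accuracy when, in addition, the predicted class is correct. The function names and the sample format are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localization_accuracy(samples, iou_thresh=0.5):
    """Return (top1_loc, gt_known_loc) as percentages.

    Each sample is a dict with keys pred_box, gt_box, pred_label, gt_label.
    Assumption: IoU >= 0.5 is the conventional threshold.
    """
    if not samples:
        raise ValueError("need at least one sample")
    top1_hits = 0
    gt_known_hits = 0
    for s in samples:
        box_ok = iou(s["pred_box"], s["gt_box"]) >= iou_thresh
        if box_ok:
            gt_known_hits += 1  # box is right, class ignored
            if s["pred_label"] == s["gt_label"]:
                top1_hits += 1  # box and class are both right
    n = len(samples)
    return 100.0 * top1_hits / n, 100.0 * gt_known_hits / n


# Hypothetical predictions, not data from the paper.
toy_samples = [
    {"pred_box": (0, 0, 10, 10), "gt_box": (0, 0, 10, 10), "pred_label": 3, "gt_label": 3},
    {"pred_box": (0, 0, 10, 10), "gt_box": (0, 0, 10, 10), "pred_label": 7, "gt_label": 3},
    {"pred_box": (0, 0, 4, 4), "gt_box": (0, 0, 10, 10), "pred_label": 3, "gt_label": 3},
    {"pred_box": (0, 0, 10, 6), "gt_box": (0, 0, 10, 10), "pred_label": 5, "gt_label": 5},
]
top1_loc, gt_known_loc = localization_accuracy(toy_samples)
```

Under this convention Top-1 localization accuracy can never exceed GT-known localization accuracy, which matches the ordering of the values listed for each dataset.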
