TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Open Vocabulary Panoptic Segmentation	ADE20K	ODISE (Caption)	PQ	23.4	#4
Open Vocabulary Panoptic Segmentation	ADE20K	ODISE (Label)	PQ	22.6	#5
Open Vocabulary Semantic Segmentation	ADE20K-150	ODISE	mIoU	29.9	#10
Open Vocabulary Semantic Segmentation	ADE20K-847	ODISE	mIoU	11.1	#10
Open Vocabulary Semantic Segmentation	PASCAL Context-459	ODISE	mIoU	14.5	#8
Open Vocabulary Semantic Segmentation	PASCAL Context-59	ODISE	mIoU	57.3	#9
Open Vocabulary Semantic Segmentation	PascalVOC-20	ODISE	mIoU	84.6	#11
Zero Shot Segmentation	Segmentation in the Wild	ODISE	Mean AP	38.7	#6
Open-World Instance Segmentation	UVO	ODISE	ARmask	57.7	#2
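
For readers unfamiliar with the PQ metric reported in the table, the following is a minimal sketch of the standard panoptic quality formula for a single class: the sum of IoUs over matched segment pairs, divided by TP + 0.5 FP + 0.5 FN, where a predicted and a ground-truth segment count as matched when their IoU exceeds 0.5. The function name and the toy numbers are illustrative assumptions and are not taken from the ODISE code; benchmark PQ additionally averages this quantity over classes.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Single-class PQ from the IoUs of matched (IoU > 0.5) segment pairs.

    Hypothetical helper for illustration only.
    """
    if any(iou <= 0.5 for iou in matched_ious):
        raise ValueError("matched segments must have IoU > 0.5")
    true_positives = len(matched_ious)
    denominator = (
        true_positives
        + 0.5 * num_false_positives
        + 0.5 * num_false_negatives
    )
    if denominator == 0:
        # Assumption: a class with no segments at all scores 0 here;
        # real evaluators usually skip such classes when averaging.
        return 0.0
    return sum(matched_ious) / denominator


# Toy example: two matched segments, one spurious prediction, one missed segment.
pq = panoptic_quality([0.9, 0.7], num_false_positives=1, num_false_negatives=1)
print(f"PQ = {pq:.3f}")
```

Leaderboards such as the one above conventionally report PQ scaled by 100.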

Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

CVPR 2023 · Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello

We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies pre-trained text-image diffusion and discriminative models to perform open-vocabulary panoptic segmentation. Text-to-image diffusion models have the remarkable ability to generate high-quality images with diverse open-vocabulary language descriptions. This demonstrates that their internal representation space is highly correlated with open concepts in the real world. Text-image discriminative models like CLIP, on the other hand, are good at classifying images into open-vocabulary labels. We leverage the frozen internal representations of both these models to perform panoptic segmentation of any category in the wild. Our approach outperforms the previous state of the art by significant margins on both open-vocabulary panoptic and semantic segmentation tasks. In particular, with COCO training only, our method achieves 23.4 PQ and 30.0 mIoU on the ADE20K dataset, with 8.3 PQ and 7.9 mIoU absolute improvement over the previous state of the art. We open-source our code and models at https://github.com/NVlabs/ODISE.
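
The open-vocabulary classification idea in the abstract can be illustrated with a toy sketch: each predicted mask carries an embedding, each candidate category name has a text embedding, and the mask is assigned the category whose text embedding is most similar. This is only a sketch under stated assumptions. The 3-D vectors are made-up stand-ins for features that, in ODISE, come from frozen diffusion and CLIP models; the function names and the temperature value are hypothetical, and the released pipeline is more involved than this.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_mask(mask_embedding, text_embeddings, temperature=0.1):
    """Return the best-matching category name and a softmax over all names.

    temperature is an assumed value chosen for illustration.
    """
    names = list(text_embeddings)
    logits = [
        cosine_similarity(mask_embedding, text_embeddings[name]) / temperature
        for name in names
    ]
    # Subtract the maximum logit for a numerically stable softmax.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    probs = {name: e / total for name, e in zip(names, exps)}
    best = max(probs, key=probs.get)
    return best, probs


# Made-up embeddings: the category list can be changed at test time
# without retraining, which is what "open-vocabulary" refers to.
text_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "zebra": [0.0, 0.0, 1.0],
}
mask_embedding = [0.1, 0.2, 0.9]

label, probs = classify_mask(mask_embedding, text_embeddings)
print(label)
```

Because the categories enter only through their text embeddings, adding a new class in this sketch amounts to adding one more entry to the dictionary.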

Code

nvlabs/odise (official)

Quickstart in Colab and Spaces

799 GitHub stars

Tasks

Open Vocabulary Panoptic Segmentation

Open Vocabulary Semantic Segmentation

Open-World Instance Segmentation

Panoptic Segmentation

Segmentation

Semantic Segmentation

Zero Shot Segmentation

Datasets

ImageNet

MS COCO

ADE20K

PASCAL VOC

UVO

Segmentation in the Wild

Results from the Paper

Ranked #2 on Open-World Instance Segmentation on UVO (using extra training data)

Methods

CLIP • Diffusion
