TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ADE20K	MaskFormer (ResNet-101)	Validation mIoU	48.1	# 142
Semantic Segmentation	ADE20K	MaskFormer (Swin-B)	Validation mIoU	53.8	# 67
Semantic Segmentation	ADE20K val	MaskFormer (Swin-L, ImageNet-22k pretrain)	mIoU	55.6	# 26
Panoptic Segmentation	ADE20K val	MaskFormer (R101 + 6 Enc)	PQ	35.7	# 21
Panoptic Segmentation	COCO minival	MaskFormer (single-scale)	PQ	52.7	# 17
Panoptic Segmentation	COCO minival	MaskFormer (single-scale)	SQ	81.8	# 2
Panoptic Segmentation	COCO minival	MaskFormer (single-scale)	RQ	63.5	# 1
Panoptic Segmentation	COCO minival	MaskFormer (single-scale)	PQth	58.5	# 13
Panoptic Segmentation	COCO minival	MaskFormer (single-scale)	PQst	44.0	# 13
Panoptic Segmentation	COCO test-dev	MaskFormer (Swin-L)	PQ	53.3	# 9
Panoptic Segmentation	COCO test-dev	MaskFormer (Swin-L)	PQst	44.5	# 8
Panoptic Segmentation	COCO test-dev	MaskFormer (Swin-L)	PQth	59.1	# 9
Semantic Segmentation	Mapillary val	MaskFormer (ResNet-50)	mIoU	55.4	# 4
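
The PQ, SQ, and RQ columns above are related: under the standard panoptic quality definition, PQ for a single class factors into segmentation quality (mean IoU of matched segments) times recognition quality (an F1-style detection score). The sketch below computes all three for one class. The function name `panoptic_quality` and the example counts are hypothetical and chosen only for illustration. Note that the identity PQ = SQ × RQ holds per class; benchmark numbers are typically averaged over classes, which is likely why 81.8 × 63.5 / 100 ≈ 51.9 does not equal the reported 52.7 on COCO minival.

```python
from typing import List, Tuple


def panoptic_quality(matched_ious: List[float], num_fp: int, num_fn: int) -> Tuple[float, float, float]:
    """Return (PQ, SQ, RQ) for a single class.

    matched_ious holds the IoU of each true-positive segment match
    (by convention, a match requires IoU > 0.5).
    """
    num_tp = len(matched_ious)
    denom = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        # Assumption: report zeros when the class has no segments at all.
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / num_tp if num_tp else 0.0  # mean IoU of matches
    rq = num_tp / denom                                 # F1-style detection score
    return sq * rq, sq, rq


# Hypothetical counts for one class, for illustration only.
pq, sq, rq = panoptic_quality([0.9, 0.8, 0.7], num_fp=1, num_fn=1)
print(f"PQ={pq:.3f} SQ={sq:.3f} RQ={rq:.3f}")
```

A dataset-level score would average the per-class PQ values, not multiply the averaged SQ and RQ.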

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/per-pixel-classification-is-not-all-you-need/semantic-segmentation-on-mapillary-val)](https://paperswithcode.com/sota/semantic-segmentation-on-mapillary-val?p=per-pixel-classification-is-not-all-you-need)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/per-pixel-classification-is-not-all-you-need/panoptic-segmentation-on-coco-test-dev)](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-test-dev?p=per-pixel-classification-is-not-all-you-need)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/per-pixel-classification-is-not-all-you-need/panoptic-segmentation-on-coco-minival)](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-minival?p=per-pixel-classification-is-not-all-you-need)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/per-pixel-classification-is-not-all-you-need/panoptic-segmentation-on-ade20k-val)](https://paperswithcode.com/sota/panoptic-segmentation-on-ade20k-val?p=per-pixel-classification-is-not-all-you-need)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/per-pixel-classification-is-not-all-you-need/semantic-segmentation-on-ade20k-val)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k-val?p=per-pixel-classification-is-not-all-you-need)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/per-pixel-classification-is-not-all-you-need/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=per-pixel-classification-is-not-all-you-need)`

Per-Pixel Classification is Not All You Need for Semantic Segmentation

NeurIPS 2021 · Bowen Cheng, Alexander G. Schwing, Alexander Kirillov

Modern approaches typically formulate semantic segmentation as a per-pixel classification task, while instance-level segmentation is handled with an alternative mask classification. Our key insight: mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks in a unified manner using the exact same model, loss, and training procedure. Following this observation, we propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each associated with a single global class label prediction. Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results. In particular, we observe that MaskFormer outperforms per-pixel classification baselines when the number of classes is large. Our mask classification-based method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
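
To make the mask classification formulation concrete, the sketch below turns a set of predicted masks, each paired with one class distribution, into a per-pixel semantic labeling. It scores each class at each pixel as the sum over masks of (class probability × mask probability) and takes the argmax, which is one common way to read out a semantic map from mask classification outputs. The function name `semantic_inference`, the toy inputs, and the use of plain Python lists are assumptions for illustration and are not taken from the official code; handling of a "no object" class is omitted, and a real implementation would use tensor operations.

```python
from typing import List

Mask = List[List[float]]  # H x W per-pixel mask probabilities in [0, 1]


def semantic_inference(class_probs: List[List[float]], masks: List[Mask]) -> List[List[int]]:
    """Combine N (class distribution, mask) pairs into an H x W label map.

    class_probs[i][c] is the probability that mask i belongs to class c.
    masks[i][y][x] is the probability that pixel (y, x) belongs to mask i.
    """
    if len(class_probs) != len(masks):
        raise ValueError("need one class distribution per mask")
    num_classes = len(class_probs[0])
    height, width = len(masks[0]), len(masks[0][0])

    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c at this pixel: sum_i p_i(c) * m_i[y][x].
            scores = [
                sum(p[c] * m[y][x] for p, m in zip(class_probs, masks))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        labels.append(row)
    return labels


# Toy example: two masks over a 2 x 2 image, three classes.
class_probs = [
    [0.9, 0.1, 0.0],  # mask 0 is most likely class 0
    [0.1, 0.1, 0.8],  # mask 1 is most likely class 2
]
masks = [
    [[0.9, 0.8], [0.1, 0.2]],  # mask 0 covers the top row
    [[0.1, 0.1], [0.9, 0.7]],  # mask 1 covers the bottom row
]
label_map = semantic_inference(class_probs, masks)
print(label_map)  # [[0, 0], [2, 2]]
```

Because every mask carries a single global class label, the same set of outputs can be read out either as a semantic map, as here, or as separate segments for panoptic evaluation.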

Code

facebookresearch/MaskFormer (official): 1,287 stars

huggingface/transformers: 124,889 stars

open-mmlab/mmdetection: 27,765 stars

Tasks

Classification

Panoptic Segmentation

Segmentation

Semantic Segmentation

Datasets

MS COCO

Cityscapes

ADE20K

COCO-Stuff

Mapillary Vistas Dataset

Results from the Paper

Ranked #4 on Semantic Segmentation on Mapillary val
