SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers

9 Jun 2023 · BoWen Zhang, Liyang Liu, Minh Hieu Phan, Zhi Tian, Chunhua Shen, Yifan Liu

This paper investigates the capability of plain Vision Transformers (ViTs) for semantic segmentation using the encoder-decoder framework and introduces SegViTv2. In this study, we introduce a novel Attention-to-Mask (ATM) module to design a lightweight decoder effective for plain ViT. The proposed ATM converts the global attention map into semantic masks for high-quality segmentation results. Our decoder outperforms the popular decoder UPerNet using various ViT backbones while consuming only about 5% of its computational cost. For the encoder, we address the concern of the relatively high computational cost in ViT-based encoders and propose a Shrunk++ structure that incorporates edge-aware query-based down-sampling (EQD) and query-based up-sampling (QU) modules. The Shrunk++ structure reduces the computational cost of the encoder by up to 50% while maintaining competitive performance. Furthermore, we propose to adapt SegViT for continual semantic segmentation, demonstrating nearly zero forgetting of previously learned knowledge. Experiments show that our proposed SegViTv2 surpasses recent segmentation methods on three popular benchmarks: ADE20K, COCO-Stuff-10k and PASCAL-Context. The code is available at https://github.com/zbwxp/SegVit.
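The Attention-to-Mask idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each class is represented by a learnable query vector, that the similarity between a query and every patch token is a scaled dot product, and that a sigmoid turns that similarity map into a per-class mask. The function name `attention_to_mask` and all argument names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_to_mask(class_queries, patch_tokens, height, width):
    """Hypothetical sketch: build one soft mask per class query.

    class_queries: list of N vectors (one per class), each of length C.
    patch_tokens:  list of height * width vectors of length C, row-major.
    Returns N masks, each a height x width nested list of values in (0, 1).
    """
    dim = len(class_queries[0])
    if len(patch_tokens) != height * width:
        raise ValueError("patch_tokens must contain height * width vectors")
    scale = 1.0 / math.sqrt(dim)  # assumed scaled dot-product similarity

    masks = []
    for query in class_queries:
        # Similarity between this class query and every patch token.
        scores = [
            scale * sum(q * k for q, k in zip(query, token))
            for token in patch_tokens
        ]
        # Sigmoid keeps classes independent, unlike a softmax over patches.
        flat = [sigmoid(s) for s in scores]
        # Reshape the flat list of patch scores back to a 2D mask.
        masks.append(
            [flat[r * width:(r + 1) * width] for r in range(height)]
        )
    return masks
```

Calling `attention_to_mask` with two queries and a 2 x 2 grid of tokens returns two 2 x 2 masks. Patches whose tokens align with a query receive values near 1 in that query's mask, and orthogonal patches receive 0.5.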


Code


zbwxp/SegVit official

177

Tasks


Continual Learning

Continual Semantic Segmentation

Segmentation

Semantic Segmentation

Datasets

ADE20K

COCO-Stuff

Results from the Paper


Ranked #16 on Semantic Segmentation on ADE20K


Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Semantic Segmentation	ADE20K	SegViT-v2 (BEiT-v2-Large)	Validation mIoU	58.2	# 16
Semantic Segmentation	ADE20K	SegViT-v2 (BEiT-v2-Large)	GFLOPs (512 x 512)	637.9	# 9
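For readers unfamiliar with the Validation mIoU metric in the table, the sketch below shows how mean intersection-over-union is commonly computed from flattened label lists. It is an illustration under stated assumptions, not the benchmark's evaluation script: the function name `mean_iou` is hypothetical, the ignore label 255 is an assumed convention, and classes that appear in neither prediction nor ground truth are skipped when averaging.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over flattened per-pixel class labels.

    pred, target: equal-length sequences of integer class ids.
    ignore_index: assumed label for pixels excluded from scoring.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel, not scored
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1

    # Average only over classes that occur in prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")
```

In practice the intersection and union counts are accumulated over the whole validation set before dividing, rather than averaged per image; the same function covers that case if all images are concatenated into one pair of lists.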

