ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment

24 Nov 2021 · Robin Karlsson, Tomoki Hayashi, Keisuke Fujii, Alexander Carballo, Kento Ohtani, Kazuya Takeda

Recent self-supervised models have demonstrated performance equal to or better than supervised methods, opening the way for AI systems to learn visual representations from practically unlimited data. However, these methods are typically classification-based and thus ineffective for learning high-resolution feature maps that preserve precise spatial information. This work introduces superpixels to improve self-supervised learning of dense, semantically rich visual concept embeddings. Decomposing images into a small set of visually coherent regions reduces the computational complexity by $\mathcal{O}(1000)$ while preserving detail. We experimentally show that contrasting over regions improves the effectiveness of contrastive learning methods, extends their applicability to high-resolution images, and improves overclustering performance; that superpixels are better than grids; and that regional masking improves performance. The expressiveness of our dense embeddings is demonstrated by improving on the state of the art (SOTA) for unsupervised semantic segmentation on Cityscapes, and for convolutional models on COCO.
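
To make the idea of working over regions instead of pixels concrete, here is a minimal sketch of how dense per-pixel features might be reduced to one embedding per superpixel by averaging. It is not the authors' implementation: the function name `pool_features_by_region`, the use of a plain mean, and the toy data are assumptions for illustration, and the region label map is assumed to come from any superpixel algorithm.

```python
from collections import defaultdict


def pool_features_by_region(features, labels):
    """Average per-pixel feature vectors within each region (hypothetical helper).

    features: H x W nested lists of equal-length feature vectors
    labels:   H x W nested lists of integer region ids (e.g. superpixel ids)
    Returns a dict mapping region id -> mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for feat_row, label_row in zip(features, labels):
        for vec, region in zip(feat_row, label_row):
            if region not in sums:
                sums[region] = [0.0] * len(vec)
            acc = sums[region]
            for i, value in enumerate(vec):
                acc[i] += value
            counts[region] += 1
    return {r: [s / counts[r] for s in sums[r]] for r in sums}


# Toy 2x4 "image" with 2-dimensional features and two regions (ids 0 and 1).
features = [
    [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]],
    [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]],
]
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

region_embeddings = pool_features_by_region(features, labels)
print(region_embeddings)  # {0: [2.0, 0.0], 1: [0.0, 3.0]}
```

In this toy case eight pixel vectors collapse to two region vectors, so any loss computed over regions touches far fewer elements than one computed over pixels. With real images the reduction depends on image resolution and the number of superpixels chosen.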

Code

robin-karlsson0/vice (official)

Tasks

Contrastive Learning

Domain Generalization

Learning Word Embeddings

Representation Learning

Self-Supervised Learning

Semantic Segmentation

Superpixels

Unsupervised Semantic Segmentation

Word Embeddings

Datasets

ImageNet

MS COCO

Cityscapes

COCO-Stuff

Results from the Paper

Ranked #3 on Unsupervised Semantic Segmentation on Cityscapes test

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Unsupervised Semantic Segmentation	Cityscapes test	ViCE	mIoU	25.2	# 3
Unsupervised Semantic Segmentation	Cityscapes test	ViCE	Accuracy	84.3	# 3
Unsupervised Semantic Segmentation	COCO-Stuff-27	ViCE	Accuracy	64.8	# 3
Unsupervised Semantic Segmentation	COCO-Stuff-27	ViCE	mIoU	21.77	# 12
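
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how pixel accuracy and mean intersection-over-union (mIoU) are commonly computed from a confusion matrix. The exact evaluation protocol behind the numbers above is not stated on this page. The sketch assumes that predicted cluster ids have already been matched to ground-truth class ids, a step unsupervised benchmarks typically perform first, and all function names and toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels by (true class, predicted class); rows are true classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example: 8 flattened pixels, 2 classes (assumed already matched).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(f"accuracy={pixel_accuracy(cm):.3f}, mIoU={mean_iou(cm):.3f}")
# accuracy=0.625, mIoU=0.450
```

The toy numbers show why the two metrics can diverge: accuracy counts every pixel equally, while mIoU averages per-class scores, so errors on a class weigh the same regardless of how many pixels it covers.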

Methods

1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Contrastive Learning • Convolution • Global Average Pooling • Kaiming Initialization • Max Pooling • ReLU • Residual Block • Residual Connection • ResNet • SuperpixelGridMasks
