TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50 x2)	Top 1 Accuracy	70.6%	# 96
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50 x2)	Top 5 Accuracy	89.7%	# 25
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50 x2)	Number of Params	188M	# 27
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50)	Top 1 Accuracy	66.2%	# 103
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50)	Top 5 Accuracy	87.0%	# 27
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50)	Number of Params	47M	# 45
Self-Supervised Image Classification	ImageNet	CMC (ResNet-101)-deprecated	Top 1 Accuracy	65.0%	# 107
Self-Supervised Image Classification	ImageNet	CMC (ResNet-101)-deprecated	Top 5 Accuracy	86.0%	# 28
Self-Supervised Action Recognition	UCF101	Contrastive Multiview Coding (CaffeNet x2)	3-fold Accuracy	59.1	# 48
Self-Supervised Action Recognition	UCF101	Contrastive Multiview Coding (CaffeNet x2)	Pre-Training Dataset	UCF101	# 1
Self-Supervised Action Recognition	UCF101	Contrastive Multiview Coding (CaffeNet x2)	Frozen	false	# 1

Contrastive Multiview Coding

ECCV 2020 · Yonglong Tian, Dilip Krishnan, Phillip Isola

Humans view the world through many sensory channels, e.g., the long-wavelength light channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right ear. Each view is noisy and incomplete, but important factors, such as physics, geometry, and semantics, tend to be shared between all views (e.g., a "dog" can be seen, heard, and felt). We investigate the classic hypothesis that a powerful representation is one that models view-invariant factors. We study this hypothesis under the framework of multiview contrastive learning, where we learn a representation that aims to maximize mutual information between different views of the same scene but is otherwise compact. Our approach scales to any number of views, and is view-agnostic. We analyze key properties of the approach that make it work, finding that the contrastive loss outperforms a popular alternative based on cross-view prediction, and that the more views we learn from, the better the resulting representation captures underlying scene semantics. Our approach achieves state-of-the-art results on image and video unsupervised learning benchmarks. Code is released at: http://github.com/HobbitLong/CMC/.
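The contrastive objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the official repository for that): it assumes in-batch negatives rather than a memory bank, the temperature value is illustrative, and the function names `info_nce`, `two_view_loss`, and `multiview_loss` are hypothetical. Each view takes a turn as the anchor, and the multi-view case sums the two-view loss over all pairs of views.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def info_nce(z1, z2, temperature=0.07):
    """One-directional InfoNCE loss with view 1 as the anchor.

    z1, z2: arrays of shape (batch, dim). Row i of z1 and row i of z2 are
    embeddings of two views of the same scene (the positive pair); all other
    rows of z2 act as negatives. In-batch negatives are an assumption here.
    """
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    logits = z1 @ z2.T / temperature                      # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                    # positives sit on the diagonal

def two_view_loss(z1, z2, temperature=0.07):
    """Symmetrized loss: each view is used once as the anchor."""
    return info_nce(z1, z2, temperature) + info_nce(z2, z1, temperature)

def multiview_loss(views, temperature=0.07):
    """Sum the two-view loss over every pair of views."""
    total = 0.0
    for i in range(len(views)):
        for j in range(i + 1, len(views)):
            total += two_view_loss(views[i], views[j], temperature)
    return total

# Toy usage: random vectors stand in for encoder outputs of three views.
rng = np.random.default_rng(0)
views = [rng.normal(size=(8, 16)) for _ in range(3)]
loss = multiview_loss(views)
```

In practice the embeddings would come from one encoder per view (for example, separate networks for the luminance and chrominance channels of an image), and the loss would be minimized with a gradient-based optimizer in an autodiff framework.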

Code

HobbitLong/CMC official

1,273

HobbitLong/PyContrast

1,909

szq0214/Rethinking-Image-Mixture-fo…

148

SsnL/moco_align_uniform

SsnL/moco

See all 8 implementations

Tasks

Contrastive Learning

Self-Supervised Action Recognition

Self-Supervised Image Classification

Datasets

ImageNet

UCF101

NYUv2

Results from the Paper

Ranked #48 on Self-Supervised Action Recognition on UCF101

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50 x2)	Top 1 Accuracy	70.6%	# 96
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50 x2)	Top 5 Accuracy	89.7%	# 25
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50 x2)	Number of Params	188M	# 27
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50)	Top 1 Accuracy	66.2%	# 103
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50)	Top 5 Accuracy	87.0%	# 27
Self-Supervised Image Classification	ImageNet	CMC (ResNet-50)	Number of Params	47M	# 45
Self-Supervised Image Classification	ImageNet	CMC (ResNet-101)-deprecated	Top 1 Accuracy	65.0%	# 107
Self-Supervised Image Classification	ImageNet	CMC (ResNet-101)-deprecated	Top 5 Accuracy	86.0%	# 28
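For readers unfamiliar with the Top 1 and Top 5 Accuracy metrics listed above, the following sketch shows the usual definition: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from the paper or its code.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=1):
    """Fraction of rows whose true label is among the k highest-scoring classes."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    top_k = np.argsort(-scores, axis=1)[:, :k]       # indices of the k largest scores per row
    hits = (top_k == labels[:, None]).any(axis=1)    # True where the label appears in the top k
    return float(hits.mean())

# Toy example: 3 samples, 3 classes (made-up scores).
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
labels = np.array([1, 1, 2])
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

On this toy input, two of the three samples are correct at k=1 and all three at k=2, which is why Top 5 figures are always at least as high as Top 1 figures for the same model.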

Results from Other Papers

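Task	Dataset	Model	Metric Name	Metric Value	Rank
Self-Supervised Action Recognition	UCF101	Contrastive Multiview Coding (CaffeNet x2)	3-fold Accuracy	59.1	# 48
Self-Supervised Action Recognition	UCF101	Contrastive Multiview Coding (CaffeNet x2)	Pre-Training Dataset	UCF101	# 1
Self-Supervised Action Recognition	UCF101	Contrastive Multiview Coding (CaffeNet x2)	Frozen	false	# 1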

Methods

1x1 Convolution • Adam • AlexNet • Average Pooling • Batch Normalization • Bottleneck Residual Block • Contrastive Multiview Coding • Convolution • Dense Connections • Dropout • Global Average Pooling • Grouped Convolution • InfoNCE • Kaiming Initialization • Local Response Normalization • Max Pooling • Random Horizontal Flip • Random Resized Crop • ReLU • Residual Block • Residual Connection • ResNet • SGD with Momentum • Softmax • Weight Decay
