TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Self-Supervised Image Classification	ImageNet	iGPT-XL (64x64, 15360 features)	Top 1 Accuracy	72.0%	# 91
Self-Supervised Image Classification	ImageNet	iGPT-XL (64x64, 15360 features)	Number of Params	6801M	# 1
Self-Supervised Image Classification	ImageNet	iGPT-XL (64x64, 3072 features)	Top 1 Accuracy	68.7%	# 98
Self-Supervised Image Classification	ImageNet	iGPT-L (48x48)	Top 1 Accuracy	65.2%	# 106
Self-Supervised Image Classification	ImageNet	iGPT-L (32x32)	Top 1 Accuracy	60.3%	# 116
Image Classification	STL-10	iGPT-L	Percentage correct	97.1	# 15
Image Classification	STL-10	AMDIM-L	Percentage correct	94.2	# 24

Generative Pretraining from Pixels

ICML 2020 · Mark Chen, Alec Radford, Rewon Child, Jeff Wu, Heewoo Jun, Prafulla Dhariwal, David Luan, Ilya Sutskever

Inspired by progress in unsupervised representation learning for natural language, we examine whether similar models can learn useful representations for images. We train a sequence Transformer to autoregressively predict pixels, without incorporating knowledge of the 2D input structure. Despite training on low-resolution ImageNet without labels, we find that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification. On CIFAR-10, we achieve 96.3% accuracy with a linear probe, outperforming a supervised Wide ResNet, and 99.0% accuracy with full fine-tuning, matching the top supervised pre-trained models. An even larger model trained on a mixture of ImageNet and web images is competitive with self-supervised benchmarks on ImageNet, achieving 72.0% top-1 accuracy on a linear probe of our features.
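
The abstract's central idea, treating an image as a 1D sequence and predicting each pixel from the ones before it, can be illustrated with a small sketch. This is not the authors' implementation: the raster ordering, the toy 4-token vocabulary, the `START_TOKEN` id, and the helper names are all assumptions made for illustration, and a uniform distribution stands in for the Transformer.

```python
import math
from typing import List, Tuple


def flatten_raster(image: List[List[int]]) -> List[int]:
    """Flatten a 2D grid of pixel tokens into a 1D sequence, row by row."""
    return [token for row in image for token in row]


def next_token_pairs(sequence: List[int], start_token: int) -> Tuple[List[int], List[int]]:
    """Shift the sequence right by one so position i is predicted from tokens before i."""
    inputs = [start_token] + sequence[:-1]
    targets = list(sequence)
    return inputs, targets


def autoregressive_nll(probs: List[List[float]], targets: List[int]) -> float:
    """Mean negative log-likelihood of the targets under per-position distributions."""
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


# Toy 2x3 "image" whose pixels are already quantized to tokens in {0, 1, 2, 3}.
VOCAB_SIZE = 4
START_TOKEN = VOCAB_SIZE  # hypothetical extra id, used only as a start marker
image = [[0, 1, 2],
         [3, 2, 1]]

sequence = flatten_raster(image)
inputs, targets = next_token_pairs(sequence, START_TOKEN)

# Stand-in for a trained model: a uniform distribution at every position.
uniform = [[1.0 / VOCAB_SIZE] * VOCAB_SIZE for _ in targets]
loss = autoregressive_nll(uniform, targets)

print(sequence)  # flattened pixel tokens
print(inputs)    # what the model would see at each position
print(loss)      # equals log(VOCAB_SIZE) for the uniform stand-in
```

Once the image is flattened, nothing in the input/target construction refers to rows or columns, which is the sense in which the model has no built-in knowledge of the 2D structure. A trained model would replace the uniform stand-in and drive the loss below `log(VOCAB_SIZE)`.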

Code

Repository	Stars
openai/image-gpt (official)	2,000
EugenHotaj/pytorch-generative	400
teddykoker/image-gpt	248
apeguero1/image-gpt	

Tasks

Image Classification

Representation Learning

Self-Supervised Image Classification

Datasets

ImageNet

STL-10

Results from the Paper

Ranked #15 on Image Classification on STL-10 (using extra training data)

Methods

1x1 Convolution • Adam • Average Pooling • Batch Normalization • Bottleneck Residual Block • BPE • Convolution • Cosine Annealing • Dense Connections • Discriminative Fine-Tuning • GELU • Global Average Pooling • GPT-2 • Kaiming Initialization • Layer Normalization • Linear Layer • Linear Warmup With Cosine Annealing • Max Pooling • Multi-Head Attention • Residual Block • Residual Connection • ResNet • Scaled Dot-Product Attention • Softmax • Weight Decay • Wide Residual Block • WideResNet
