Going deeper with Image Transformers

Transformers have recently been adapted for large-scale image classification, achieving high scores that shake up the long-standing supremacy of convolutional neural networks. However, the optimization of image transformers has been little studied so far. In this work, we build and optimize deeper transformer networks for image classification. In particular, we investigate the interplay of architecture and optimization in such dedicated transformers. We make two changes to the transformer architecture that significantly improve the accuracy of deep transformers. As a result, we produce models whose performance does not saturate early with increasing depth: for instance, we obtain 86.5% top-1 accuracy on ImageNet when training with no external data, attaining the current state of the art with fewer FLOPs and parameters. Moreover, our best model establishes a new state of the art on ImageNet with Reassessed labels and on ImageNet-V2 / matched frequency, in the setting with no additional training data. We share our code and models.
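The two architecture changes the abstract refers to are LayerScale (a learnable per-channel scaling of each residual branch, initialized near zero so that very deep stacks remain trainable) and class-attention layers (a class token inserted in the final layers that attends to the patch tokens). A minimal numpy sketch of the LayerScale residual update, with hypothetical toy shapes and a stand-in block function:

```python
import numpy as np

def layer_scale_residual(x, block_fn, gamma):
    """One residual update with LayerScale: the sub-block output is
    scaled per channel by a learnable vector gamma before the skip add."""
    return x + gamma * block_fn(x)

# Toy shapes (hypothetical): 2 tokens, 4 channels.
d = 4
# The paper initializes gamma to a small depth-dependent constant
# (on the order of 1e-1 down to 1e-6), so each layer starts close
# to the identity mapping.
gamma = np.full(d, 1e-4)
x = np.ones((2, d))
block = lambda t: 3.0 * t  # stand-in for an attention or MLP sub-block
out = layer_scale_residual(x, block, gamma)
print(out[0, 0])  # ~1.0003: a barely perturbed identity at initialization
```

In a real model `gamma` would be a trained parameter per layer; the point of the sketch is only that a near-zero initialization makes each layer's contribution tiny at the start of training.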

ICCV 2021

Results from the Paper


Ranked #5 on Image Classification on CIFAR-10 (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank | Uses Extra Training Data |
|------|---------|-------|--------|-------|-------------|--------------------------|
| Image Classification | CIFAR-10 | CaiT-M-36 U 224 | Percentage correct | 99.4 | #5 | Yes |
| Image Classification | CIFAR-100 | CaiT-M-36 U 224 | Percentage correct | 93.1 | #11 | Yes |
| Image Classification | Flowers-102 | CaiT-M-36 U 224 | Accuracy | 99.1 | #13 | Yes |
| Image Classification | ImageNet | CaiT-XXS-36 | Top-1 Accuracy | 82.2% | #510 | No |
| | | | Number of params | 17.3M | #522 | |
| | | | GFLOPs | 14.3 | #335 | |
| Image Classification | ImageNet | CaiT-XS-24 | Top-1 Accuracy | 84.1% | #325 | No |
| | | | Number of params | 26.6M | #613 | |
| | | | GFLOPs | 19.3 | #364 | |
| Image Classification | ImageNet | CaiT-XS-36 | Top-1 Accuracy | 84.8% | #270 | No |
| | | | Number of params | 38.6M | #665 | |
| | | | GFLOPs | 28.8 | #389 | |
| Image Classification | ImageNet | CaiT-S-24 | Top-1 Accuracy | 85.1% | #245 | No |
| | | | Number of params | 46.9M | #710 | |
| | | | GFLOPs | 32.2 | #398 | |
| Image Classification | ImageNet | CaiT-S-36 | Top-1 Accuracy | 85.4% | #221 | No |
| | | | Number of params | 68.2M | #786 | |
| | | | GFLOPs | 48 | #421 | |
| Image Classification | ImageNet | CaiT-M-24 | Top-1 Accuracy | 85.8% | #187 | No |
| | | | Number of params | 185.9M | #887 | |
| | | | GFLOPs | 116.1 | #458 | |
| Image Classification | ImageNet | CaiT-M-36 | Top-1 Accuracy | 86.1% | #170 | No |
| | | | Number of params | 270.9M | #908 | |
| | | | GFLOPs | 173.3 | #464 | |
| Image Classification | ImageNet | CaiT-M-36-448 | Top-1 Accuracy | 86.3% | #153 | No |
| | | | Number of params | 271M | #909 | |
| | | | GFLOPs | 247.8 | #472 | |
| Image Classification | ImageNet | CaiT-S-48 | Top-1 Accuracy | 85.3% | #231 | No |
| | | | Number of params | 89.5M | #845 | |
| | | | GFLOPs | 63.8 | #435 | |
| Image Classification | ImageNet | CaiT-XXS-24 | Top-1 Accuracy | 80.9% | #618 | No |
| | | | Number of params | 12M | #496 | |
| | | | GFLOPs | 9.6 | #292 | |
| Image Classification | ImageNet | CaiT-M-48-448 | Top-1 Accuracy | 86.5% | #135 | No |
| | | | Number of params | 438M | #930 | |
| | | | GFLOPs | 377.3 | #480 | |
| Image Classification | ImageNet ReaL | CaiT-M-36-448 | Accuracy | 90.2% | #19 | No |
| Image Classification | ImageNet V2 | CaiT-M-36-448 | Top-1 Accuracy | 76.7 | #16 | No |
| Image Classification | iNaturalist 2018 | CaiT-M-36 U 224 | Top-1 Accuracy | 78% | #18 | Yes |
| Image Classification | iNaturalist 2019 | CaiT-M-36 U 224 | Top-1 Accuracy | 81.8 | #7 | Yes |
| Image Classification | Stanford Cars | CaiT-M-36 U 224 | Accuracy | 94.2 | #5 | Yes |

Methods