TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	ImageNet	RedNet-38	Top 1 Accuracy	77.6%	# 800
Image Classification	ImageNet	RedNet-38	Number of params	12.4M	# 502
Image Classification	ImageNet	RedNet-38	GFLOPs	2.2	# 154
Image Classification	ImageNet	RedNet-26	Top 1 Accuracy	75.9%	# 857
Image Classification	ImageNet	RedNet-26	Number of params	9.2M	# 468
Image Classification	ImageNet	RedNet-26	GFLOPs	1.7	# 135
Image Classification	ImageNet	RedNet-101	Top 1 Accuracy	79.1%	# 714
Image Classification	ImageNet	RedNet-101	Number of params	25.6M	# 601
Image Classification	ImageNet	RedNet-101	GFLOPs	4.7	# 220
Image Classification	ImageNet	RedNet-50	Top 1 Accuracy	78.4%	# 768
Image Classification	ImageNet	RedNet-50	Number of params	15.5M	# 517
Image Classification	ImageNet	RedNet-50	GFLOPs	2.7	# 167
Image Classification	ImageNet	RedNet-152	Top 1 Accuracy	79.3%	# 706
Image Classification	ImageNet	RedNet-152	Number of params	34M	# 657
Image Classification	ImageNet	RedNet-152	GFLOPs	6.8	# 246

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/involution-inverting-the-inherence-of/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=involution-inverting-the-inherence-of)`

Involution: Inverting the Inherence of Convolution for Visual Recognition

CVPR 2021 · Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel-specific. Instead, we present a novel atomic operation for deep neural networks by inverting the aforementioned design principles of convolution, coined as involution. We additionally demystify the recent popular self-attention operator and subsume it into our involution family as an over-complicated instantiation. The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation. Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively. Code and pre-trained models for all the tasks are available at https://github.com/d-li14/involution.
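To make the "inverted" design principles concrete, here is a minimal pure-Python sketch of a single-group involution. It is an illustration under stated assumptions, not the official implementation from the repository above: the kernel at each pixel is generated from that pixel's own feature vector (spatial-specific) and then shared by every channel (channel-agnostic). The function name `involution`, the parameter names `w_reduce` and `w_span`, the linear → ReLU → linear kernel generator, and the zero padding are all assumptions made for this example.

```python
def involution(x, w_reduce, w_span, kernel_size=3):
    """Single-group involution over x, a C x H x W nested list of floats.

    w_reduce: R x C matrix and w_span: (K*K) x R matrix. Both are
    hypothetical parameter names for the kernel-generating function.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, pad = kernel_size, kernel_size // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            pixel = [x[c][i][j] for c in range(C)]
            # Spatial-specific: the K*K kernel is computed from this pixel only
            # (assumed generator: linear -> ReLU -> linear).
            hidden = [max(0.0, sum(w * p for w, p in zip(row, pixel)))
                      for row in w_reduce]
            kernel = [sum(w * h for w, h in zip(row, hidden)) for row in w_span]
            # Channel-agnostic: the same kernel is applied to every channel.
            for c in range(C):
                acc = 0.0
                for di in range(K):
                    for dj in range(K):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < H and 0 <= jj < W:  # zero padding at borders
                            acc += kernel[di * K + dj] * x[c][ii][jj]
                out[c][i][j] = acc
    return out


# Toy usage: 2 channels, 4x4 input of ones.
x = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
w_reduce = [[1.0, 1.0]]                                     # hidden = relu(1 + 1) = 2
w_span = [[0.5] if k == 4 else [0.0] for k in range(9)]     # centre tap = 0.5 * 2 = 1
y = involution(x, w_reduce, w_span)                         # centre-only kernel: y equals x
```

By contrast, a standard convolution would use one fixed kernel for all positions and a separate kernel per channel. A practical implementation would vectorise these loops and support multiple groups and strides.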


Code


Repository	Stars
d-li14/involution (official)	1,305
PaddlePaddle/PaddleClas	5,253
xmu-xiaoma666/External-Attention-py…	1,552
ChristophReich1996/Involution	103
shikishima-TasakiLab/Involution-PyT…

See all 13 implementations

Tasks


Image Classification

Datasets

ImageNet

MS COCO

Cityscapes

Results from the Paper


Ranked #703 on Image Classification on ImageNet



Methods


Convolution • Involution
