TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ADE20K	PatchConvNet-S60 (UperNet)	Validation mIoU	49.3	# 127
Semantic Segmentation	ADE20K	PatchConvNet-L120 (UperNet)	Validation mIoU	52.9	# 78
Semantic Segmentation	ADE20K	PatchConvNet-B120 (UperNet)	Validation mIoU	52.8	# 80
Semantic Segmentation	ADE20K	PatchConvNet-B60 (UperNet)	Validation mIoU	51.1	# 93
Semantic Segmentation	ADE20K val	PatchConvNet-L120 (UperNet)	mIoU	52.9	# 38
Semantic Segmentation	ADE20K val	PatchConvNet-B120 (UperNet)	mIoU	52.8	# 39
Semantic Segmentation	ADE20K val	PatchConvNet-B60 (UperNet)	mIoU	51.1	# 43
Semantic Segmentation	ADE20K val	PatchConvNet-S60 (UperNet)	mIoU	49.3	# 55
Object Detection	COCO minival	PatchConvNet-S120 (Mask R-CNN)	box AP	47.0	# 93
Object Detection	COCO minival	PatchConvNet-S60 (Mask R-CNN)	box AP	46.4	# 97
Image Classification	ImageNet	PatchConvNet-B60-21k-384	Top 1 Accuracy	86.5%	# 135
Image Classification	ImageNet	PatchConvNet-B60-21k-384	Number of params	99.4M	# 866
Image Classification	ImageNet	PatchConvNet-B120	Top 1 Accuracy	84.1%	# 325
Image Classification	ImageNet	PatchConvNet-B120	Number of params	188.6M	# 888
Image Classification	ImageNet	PatchConvNet-B60	Top 1 Accuracy	83.5%	# 391
Image Classification	ImageNet	PatchConvNet-B60	Number of params	99.4M	# 866
Image Classification	ImageNet	PatchConvNet-S120	Top 1 Accuracy	83.2%	# 413
Image Classification	ImageNet	PatchConvNet-S120	Number of params	47.7M	# 712
Image Classification	ImageNet	PatchConvNet-S60	Top 1 Accuracy	82.1%	# 525
Image Classification	ImageNet	PatchConvNet-S60	Number of params	25.2M	# 593
Image Classification	ImageNet	PatchConvNet-S60-21k-512	Top 1 Accuracy	85.4%	# 221
Image Classification	ImageNet	PatchConvNet-S60-21k-512	Number of params	25.2M	# 593
Image Classification	ImageNet	PatchConvNet-L120-21k-384	Top 1 Accuracy	87.1%	# 103
Image Classification	ImageNet	PatchConvNet-L120-21k-384	Number of params	334.3M	# 920

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/augmenting-convolutional-networks-with/semantic-segmentation-on-ade20k-val)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k-val?p=augmenting-convolutional-networks-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/augmenting-convolutional-networks-with/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=augmenting-convolutional-networks-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/augmenting-convolutional-networks-with/object-detection-on-coco-minival)](https://paperswithcode.com/sota/object-detection-on-coco-minival?p=augmenting-convolutional-networks-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/augmenting-convolutional-networks-with/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=augmenting-convolutional-networks-with)`

Augmenting Convolutional networks with attention-based aggregation

27 Dec 2021 · Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Piotr Bojanowski, Armand Joulin, Gabriel Synnaeve, Hervé Jégou

We show how to augment any convolutional network with an attention-based global map to achieve non-local reasoning. We replace the final average pooling by an attention-based aggregation layer, akin to a single transformer block, which weights how the patches are involved in the classification decision. We plug this learned aggregation layer into a simplistic patch-based convolutional network parametrized by two parameters (width and depth). In contrast with a pyramidal design, this architecture family maintains the input patch resolution across all the layers. It yields surprisingly competitive trade-offs between accuracy and complexity, in particular in terms of memory consumption, as shown by our experiments on various computer vision tasks: object classification, image segmentation and detection.
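The contrast between average pooling and attention-based aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`average_pool`, `attention_pool`) and the `query` vector are hypothetical, the sketch uses a single head, and it omits the learned key/value projections, normalization, and feed-forward parts that a full transformer block would include. It only shows how a learned query can weight patches non-uniformly before they are summed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(patches):
    """Baseline: uniform mean over all patch vectors."""
    n = len(patches)
    dim = len(patches[0])
    return [sum(p[d] for p in patches) / n for d in range(dim)]


def attention_pool(patches, query):
    """Single-head attention pooling (illustrative assumption, not the paper's code).

    Each patch vector serves as both key and value; `query` stands in for a
    learned class vector. Returns the pooled vector and the per-patch weights.
    """
    dim = len(query)
    scale = 1.0 / math.sqrt(dim)
    # Scaled dot product between the query and every patch.
    scores = [scale * sum(q * x for q, x in zip(query, p)) for p in patches]
    weights = softmax(scores)
    # Weighted sum of patch vectors replaces the uniform mean.
    pooled = [sum(w * p[d] for w, p in zip(weights, patches)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pooled, weights = attention_pool(patches, [5.0, 5.0])
    print("attention weights:", weights)
    print("attention pooled: ", pooled)
    print("average pooled:   ", average_pool(patches))
```

With an all-zero query the attention weights are uniform and the result coincides with average pooling, so under these assumptions average pooling is the special case in which no patch is preferred. The per-patch weights are also what makes the aggregation inspectable as a map over the image.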

Code

facebookresearch/deit official

3,861

keras-team/keras-io

2,633

DarshanDeshpande/jax-models

139

dongkyuk/PatchConvNet-pytorch

mindspore-courses/External-Attentio…

Tasks

Classification

Image Classification

Image Segmentation

Object Detection

Semantic Segmentation

Datasets

ImageNet

MS COCO

ADE20K

Results from the Paper

Ranked #38 on Semantic Segmentation on ADE20K val

Methods

Average Pooling • Class Attention • LayerScale
