TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	CIFAR-100	AA-Wide-ResNet	Percentage correct	81.6	# 114
Object Detection	COCO test-dev	AA-ResNet-10 + RetinaNet	box mAP	39.2	# 200
Object Detection	COCO test-dev	AA-ResNet-10 + RetinaNet	Hardware Burden	None	# 1
Object Detection	COCO test-dev	AA-ResNet-10 + RetinaNet	Operations per network pass	24.5G	# 1
Image Classification	ImageNet	AA-ResNet-152	Top 1 Accuracy	79.1%	# 714

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/190409925/image-classification-on-cifar-100)](https://paperswithcode.com/sota/image-classification-on-cifar-100?p=190409925)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/190409925/object-detection-on-coco)](https://paperswithcode.com/sota/object-detection-on-coco?p=190409925)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/190409925/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=190409925)`

Attention Augmented Convolutional Networks

ICCV 2019 · Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le

Convolutional networks have been the paradigm of choice in many computer vision applications. The convolution operation, however, has a significant weakness in that it only operates on a local neighborhood, thus missing global information. Self-attention, on the other hand, has emerged as a recent advance to capture long-range interactions, but has mostly been applied to sequence modeling and generative modeling tasks. In this paper, we consider the use of self-attention for discriminative visual tasks as an alternative to convolutions. We introduce a novel two-dimensional relative self-attention mechanism that proves competitive in replacing convolutions as a stand-alone computational primitive for image classification. We find in control experiments that the best results are obtained when combining both convolutions and self-attention. We therefore propose to augment convolutional operators with this self-attention mechanism by concatenating convolutional feature maps with a set of feature maps produced via self-attention. Extensive experiments show that Attention Augmentation leads to consistent improvements in image classification on ImageNet and object detection on COCO across many different models and scales, including ResNets and a state-of-the-art mobile-constrained network, while keeping the number of parameters similar. In particular, our method achieves a $1.3\%$ top-1 accuracy improvement on ImageNet classification over a ResNet50 baseline and outperforms other attention mechanisms for images such as Squeeze-and-Excitation. It also achieves an improvement of 1.4 mAP in COCO Object Detection on top of a RetinaNet baseline.
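
The concatenation described in the abstract can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it omits the two-dimensional relative position terms and any output projection applied after the heads, and the function names (`conv2d_same`, `self_attention_2d`, `augmented_conv`), tensor shapes, and random weights are all assumptions chosen for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def conv2d_same(x, w):
    """Naive stride-1, zero-padded convolution (odd kernel size assumed).

    x: (H, W, C_in), w: (k, k, C_in, C_out) -> (H, W, C_out)
    """
    k = w.shape[0]
    p = k // 2
    H, W, _ = x.shape
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros((H, W, w.shape[-1]))
    for i in range(k):
        for j in range(k):
            # Each kernel tap is a matrix product over the channel axis.
            out += xp[i:i + H, j:j + W, :] @ w[i, j]
    return out


def self_attention_2d(x, wq, wk, wv, num_heads):
    """Multi-head self-attention over all H*W positions.

    Hypothetical simplification: no relative position terms.
    x: (H, W, C_in); wq, wk: (C_in, dk); wv: (C_in, dv)
    """
    H, W, _ = x.shape
    flat = x.reshape(H * W, -1)

    def split(t):
        # (HW, d) -> (heads, HW, d / heads)
        return t.reshape(H * W, num_heads, -1).transpose(1, 0, 2)

    q, k, v = split(flat @ wq), split(flat @ wk), split(flat @ wv)
    logits = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = softmax(logits, axis=-1)           # (heads, HW, HW)
    out = (weights @ v).transpose(1, 0, 2)       # (HW, heads, dv / heads)
    return out.reshape(H, W, -1), weights


def augmented_conv(x, w_conv, wq, wk, wv, num_heads):
    # Concatenate convolutional and attention feature maps along channels.
    conv_out = conv2d_same(x, w_conv)
    attn_out, _ = self_attention_2d(x, wq, wk, wv, num_heads)
    return np.concatenate([conv_out, attn_out], axis=-1)


rng = np.random.default_rng(0)
H, W, c_in, c_out, dk, dv, heads = 8, 8, 16, 32, 8, 8, 4
x = rng.normal(size=(H, W, c_in))
# The convolution supplies c_out - dv channels; attention supplies dv.
w_conv = rng.normal(size=(3, 3, c_in, c_out - dv)) * 0.1
wq = rng.normal(size=(c_in, dk)) * 0.1
wk = rng.normal(size=(c_in, dk)) * 0.1
wv = rng.normal(size=(c_in, dv)) * 0.1

y = augmented_conv(x, w_conv, wq, wk, wv, heads)
print(y.shape)  # (8, 8, 32)
```

With these shapes the output holds 24 convolutional channels followed by 8 attention channels, so its total width matches that of a plain 32-channel convolution.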

Code

leaderj1001/Attention-Augmented-Con…

642

leaderj1001/Stand-Alone-Self-Attent…

456

titu1994/keras-attention-augmented-…

120

gan3sh500/attention-augmented-conv

sebastiani/pytorch-attention-augmen…

See all 14 implementations

Tasks

General Classification

Image Classification

Object Detection

Datasets

ImageNet

MS COCO

CIFAR-100

Results from the Paper

Ranked #114 on Image Classification on CIFAR-100 (using extra training data)

Methods

1x1 Convolution • Attention-augmented Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Cosine Annealing • Dense Connections • Depthwise Convolution • Depthwise Separable Convolution • Dropout • Focal Loss • FPN • Global Average Pooling • Inverted Residual Block • Kaiming Initialization • Linear Layer • Max Pooling • MnasNet • Multi-Head Attention • Pointwise Convolution • Random Horizontal Flip • Random Resized Crop • ReLU • Residual Block • Residual Connection • ResNet • RetinaNet • Scaled Dot-Product Attention • SGD with Momentum • Sigmoid Activation • Softmax • Squeeze-and-Excitation Block • Weight Decay • Wide Residual Block • WideResNet
