TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	ImageNet	HaloNet4 (base 128, Conv-12)	Top 1 Accuracy	85.5%	# 212
Image Classification	ImageNet	HaloNet4 (base 128, Conv-12)	Number of params	87M	# 822

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scaling-local-self-attention-for-parameter/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=scaling-local-self-attention-for-parameter)`

Scaling Local Self-Attention for Parameter Efficient Visual Backbones

CVPR 2021 · Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

Self-attention has the promise of improving computer vision systems due to parameter-independent scaling of receptive fields and content-dependent interactions, in contrast to parameter-dependent scaling and content-independent interactions of convolutions. Self-attention models have recently been shown to have encouraging improvements on accuracy-parameter trade-offs compared to baseline convolutional models such as ResNet-50. In this work, we aim to develop self-attention models that can outperform not just the canonical baseline models, but even the high-performing convolutional models. We propose two extensions to self-attention that, in conjunction with a more efficient implementation of self-attention, improve the speed, memory usage, and accuracy of these models. We leverage these improvements to develop a new self-attention model family, HaloNets, which reach state-of-the-art accuracies on the parameter-limited setting of the ImageNet classification benchmark. In preliminary transfer learning experiments, we find that HaloNet models outperform much larger models and have better inference performance. On harder tasks such as object detection and instance segmentation, our simple local self-attention and convolutional hybrids show improvements over very strong baselines. These results mark another step in demonstrating the efficacy of self-attention models on settings traditionally dominated by convolutional models.
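
The abstract's contrast between parameter-independent and parameter-dependent scaling of receptive fields can be made concrete with a small parameter count. The following is a simplified sketch, not the paper's implementation: the helper names `conv2d_params` and `local_attention_params` are hypothetical, biases are ignored, and the attention layer is assumed to consist only of query, key and value projections with no position embeddings.

```python
def conv2d_params(c_in, c_out, kernel_size):
    """Weights in a standard 2D convolution (bias omitted)."""
    return kernel_size * kernel_size * c_in * c_out


def local_attention_params(channels, window_size):
    """Weights in the query, key and value projections of one attention layer.

    window_size is accepted only to make the point explicit: it never enters
    the count. Position embeddings are left out of this simplified sketch.
    """
    del window_size  # the receptive field does not affect the projections
    return 3 * channels * channels


if __name__ == "__main__":
    channels = 64  # illustrative width, not taken from the paper
    for size in (3, 7, 11):
        print(
            f"receptive field {size}x{size}: "
            f"conv={conv2d_params(channels, channels, size)}, "
            f"attention={local_attention_params(channels, size)}"
        )
```

Under these assumptions the convolution's weight count grows with the square of the kernel size, while the attention projections stay fixed as the window widens.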

Code

rwightman/pytorch-image-models

29,758

xmu-xiaoma666/External-Attention-py…

1,552

BR-IDL/PaddleViT

1,185

leondgarse/keras_cv_attention_models

556

lucidrains/halonet-pytorch

200

See all 7 implementations

Tasks

Image Classification

Instance Segmentation

Object Detection

Semantic Segmentation

Transfer Learning

Datasets

ImageNet

MS COCO

JFT-300M

Results from the Paper

Ranked #212 on Image Classification on ImageNet

Methods

HaloNet
