Neighborhood Attention

Introduced by Hassani et al. in Neighborhood Attention Transformer

Neighborhood Attention (NA) is a restricted self-attention pattern in which each token's receptive field is limited to its nearest neighboring pixels. It was proposed in Neighborhood Attention Transformer as an alternative to other local attention mechanisms used in hierarchical vision transformers.

NA is conceptually similar to stand-alone self-attention (SASA), in that both can be implemented with a raster-scan sliding-window operation over the key-value pairs. However, NA requires a modification to handle corner pixels, which keeps the receptive field size fixed and increases the number of relative positions.
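One way to picture the corner handling is to shift each query's window so that it always lies fully inside the feature map, instead of padding beyond the border. The sketch below assumes that reading and a square, odd-sized kernel; the function names `neighborhood_start` and `neighborhood_2d` are hypothetical and are not part of NATTEN.

```python
def neighborhood_start(index: int, length: int, kernel_size: int) -> int:
    """Return the first index of the window for a query at `index` on one axis."""
    if kernel_size > length:
        raise ValueError("kernel_size must not exceed the axis length")
    start = index - kernel_size // 2
    # Clamp so the whole window stays inside [0, length).
    return max(0, min(start, length - kernel_size))


def neighborhood_2d(row: int, col: int, height: int, width: int, kernel_size: int):
    """List the (row, col) key positions attended to by the query at (row, col)."""
    r0 = neighborhood_start(row, height, kernel_size)
    c0 = neighborhood_start(col, width, kernel_size)
    return [
        (r, c)
        for r in range(r0, r0 + kernel_size)
        for c in range(c0, c0 + kernel_size)
    ]


# A corner query and a centre query both see kernel_size ** 2 keys.
corner = neighborhood_2d(0, 0, height=5, width=5, kernel_size=3)
centre = neighborhood_2d(2, 2, height=5, width=5, kernel_size=3)
```

Under this assumption every query attends to the same number of keys. A corner query sees offsets from 0 to `kernel_size - 1` along each axis rather than a range centred on zero, which is where the additional relative positions come from.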

The primary challenge in experimenting with both NA and SASA has been computation. Naively extracting the key-value pairs for each query is slow, consumes a large amount of memory, and becomes intractable at scale. NA was therefore implemented through a new CUDA extension to PyTorch, NATTEN.
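The cost of the naive approach can be seen in a minimal pure-Python sketch. It is one-dimensional and single-head for brevity, it assumes a window that is shifted to stay inside the sequence, and it is not how NATTEN is implemented; the name `naive_neighborhood_attention_1d` is hypothetical.

```python
import math


def naive_neighborhood_attention_1d(q, k, v, kernel_size):
    """q, k, v: lists of n vectors (lists of floats). Returns n output vectors."""
    n, d = len(q), len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i in range(n):
        # Shift the window so it never leaves [0, n).
        start = max(0, min(i - kernel_size // 2, n - kernel_size))
        window = range(start, start + kernel_size)
        # Explicit per-query gather: this is the slow, memory-hungry step.
        keys = [k[j] for j in window]
        values = [v[j] for j in window]
        scores = [scale * sum(a * b for a, b in zip(q[i], key)) for key in keys]
        # Numerically stable softmax over the window.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * val[c] for w, val in zip(weights, values)) / total
            for c in range(len(values[0]))
        ])
    return out


# With all-zero queries the weights are uniform, so each output is a window mean.
zeros = [[0.0] for _ in range(5)]
vals = [[0.0], [1.0], [2.0], [3.0], [4.0]]
result = naive_neighborhood_attention_1d(zeros, zeros, vals, kernel_size=3)
```

If the gathers were materialized for all queries at once, the keys and values would each occupy n × kernel_size × d entries instead of n × d, which is the memory growth described above.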

Source: Neighborhood Attention Transformer



Tasks

Task	Papers	Share
Semantic Segmentation	5	13.16%
Object Detection	3	7.89%
Image Classification	3	7.89%
Computational Efficiency	2	5.26%
Instance Segmentation	2	5.26%
Panoptic Segmentation	2	5.26%
3D Object Detection	1	2.63%
Autonomous Driving	1	2.63%
LIDAR Semantic Segmentation	1	2.63%



Categories


Attention Patterns

Attention Modules

Attention Mechanisms