Bottleneck Transformer

Introduced by Srinivas et al. in Bottleneck Transformers for Visual Recognition

The Bottleneck Transformer (BoTNet) is a backbone architecture that incorporates self-attention for multiple computer vision tasks, including image classification, object detection and instance segmentation. Replacing the spatial convolutions with global self-attention in the final three bottleneck blocks of a ResNet, with no other changes, significantly improves on the baselines for instance segmentation and object detection while also reducing the parameter count, with minimal latency overhead.

Source: Bottleneck Transformers for Visual Recognition
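
The core substitution described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the head count, the 1/sqrt(d) scaling and the bias-free projections are assumptions made for illustration, and the relative position encodings used in the paper are omitted for brevity. The sketch treats every spatial position of a feature map as a token and lets each one attend to all others, then compares the weight count of a 3x3 convolution against the three pointwise projections that replace it.

```python
import numpy as np


def global_self_attention(x, w_q, w_k, w_v, heads=4):
    """All-to-all self-attention over a feature map (illustrative sketch).

    x: (H, W, C) feature map; w_q, w_k, w_v: (C, C) pointwise projections.
    Position encodings are left out, which is a simplification.
    """
    h, w, c = x.shape
    d = c // heads
    tokens = x.reshape(h * w, c)  # each spatial position becomes one token

    def split_heads(t):
        # (HW, C) -> (heads, HW, d)
        return t.reshape(h * w, heads, d).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    # Scaled dot-product attention; the scaling factor is an assumption here.
    logits = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (heads, HW, HW)
    logits -= logits.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)

    out = weights @ v                                # (heads, HW, d)
    return out.transpose(1, 0, 2).reshape(h, w, c)


def conv3x3_params(c):
    # Weights of a 3x3 convolution with c input and c output channels, no bias.
    return 3 * 3 * c * c


def attention_params(c):
    # Weights of the query, key and value pointwise projections, no bias.
    return 3 * c * c


rng = np.random.default_rng(0)
c = 64
feature_map = rng.standard_normal((8, 8, c))
w_q, w_k, w_v = (rng.standard_normal((c, c)) * 0.05 for _ in range(3))
y = global_self_attention(feature_map, w_q, w_k, w_v)
print(y.shape, conv3x3_params(c), attention_params(c))
```

Under these assumptions the attention layer keeps the input shape, so it can stand in for the spatial convolution inside a bottleneck block, and its projections hold a third of the weights of the 3x3 convolution at the same channel width. The attention matrix has one entry per pair of positions, which is one reason to apply it only at low spatial resolution, as in the final blocks of the network.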

Tasks

Task	Papers	Share
Instance Segmentation	2	16.67%
Classification	1	8.33%
Emotion Recognition	1	8.33%
Self-Supervised Learning	1	8.33%
Semantic Segmentation	1	8.33%
Autonomous Driving	1	8.33%
Scene Understanding	1	8.33%
Anatomy	1	8.33%
General Classification	1	8.33%