Bottleneck Transformer

Introduced by Srinivas et al. in Bottleneck Transformers for Visual Recognition

The Bottleneck Transformer (BoTNet) is a backbone architecture that incorporates self-attention for multiple computer vision tasks, including image classification, object detection, and instance segmentation. It replaces the spatial convolutions with global self-attention in the final three bottleneck blocks of a ResNet and makes no other changes. Compared with the ResNet baselines, this significantly improves results on instance segmentation and object detection while also reducing the parameter count, with minimal overhead in latency.
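
As a rough illustration of the block described above, the following NumPy sketch swaps the spatial convolution of a ResNet-style bottleneck for global self-attention over all spatial positions. It is a simplification under stated assumptions, not the authors' implementation: a single attention head, scaled dot-product scores, no position encodings, no normalization layers, and 1x1 convolutions written as matrix products over the channel axis. The function names (`global_self_attention`, `bot_block`) and all weight names are hypothetical.

```python
import numpy as np


def global_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over every position of a feature map.

    x: (H, W, C) feature map; w_q, w_k, w_v: (C, D) projections.
    Assumption: no position encodings and no multi-head split.
    """
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)                  # each spatial position is a token
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    logits = q @ k.T / np.sqrt(q.shape[-1])       # (HW, HW): every position attends to all
    logits -= logits.max(axis=-1, keepdims=True)  # subtract row max for a stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ v).reshape(h, w, -1)


def bot_block(x, w_reduce, w_q, w_k, w_v, w_expand):
    """Bottleneck block with attention where the spatial convolution would be."""
    relu = lambda t: np.maximum(t, 0.0)
    y = relu(x @ w_reduce)                             # 1x1 conv: C -> D channels
    y = relu(global_self_attention(y, w_q, w_k, w_v))  # stands in for the spatial conv
    y = y @ w_expand                                   # 1x1 conv: D -> C channels
    return relu(x + y)                                 # residual connection


# Toy sizes chosen only for illustration.
rng = np.random.default_rng(0)
H, W, C, D = 4, 4, 16, 4
x = rng.standard_normal((H, W, C))
w_reduce = rng.standard_normal((C, D)) * 0.1
w_q, w_k, w_v = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
w_expand = rng.standard_normal((D, C)) * 0.1

out = bot_block(x, w_reduce, w_q, w_k, w_v, w_expand)
print(out.shape)  # (4, 4, 16)
```

In this sketch the attention step uses three D x D projections (3·D² weights), whereas a 3x3 convolution mapping D channels to D channels would use 9·D² weights. That is consistent with the parameter reduction described above, although a full model would add further terms that this sketch leaves out.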

Source: Bottleneck Transformers for Visual Recognition

Latest Papers

PAPER DATE
Bottleneck Transformers for Visual Recognition
Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
2021-01-27

Tasks

TASK                    PAPERS    SHARE
Image Classification    1         33.33%
Instance Segmentation   1         33.33%
Object Detection        1         33.33%

Categories