Bottleneck Transformer

Introduced by Srinivas et al. in Bottleneck Transformers for Visual Recognition

The Bottleneck Transformer (BoTNet) is a backbone architecture that incorporates self-attention for multiple computer vision tasks, including image classification, object detection, and instance segmentation. It replaces the spatial convolutions in the final three bottleneck blocks of a ResNet with global self-attention and makes no other changes. This substitution significantly improves on the baselines for instance segmentation and object detection while also reducing the parameter count, with minimal overhead in latency.
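To make the parameter reduction concrete, the sketch below compares the weight count of a 3x3 convolution with that of a global self-attention layer of the same width. It is a rough illustration, not the authors' implementation. The function names are hypothetical, and the layer shape is assumed: 512 bottleneck channels, 4 heads, a 14x14 feature map, bias-free 1x1 projections for queries, keys and values, no separate output projection, and one learned relative position vector per row and per column for each head.

```python
def conv3x3_params(channels: int) -> int:
    """Weights in a bias-free 3x3 convolution with `channels` inputs and outputs."""
    return 3 * 3 * channels * channels


def self_attention_params(channels: int, heads: int, height: int, width: int) -> int:
    """Weights in a global self-attention layer under the assumptions stated above."""
    # Queries, keys and values: three bias-free 1x1 projections (assumed).
    qkv = 3 * channels * channels
    # Relative position vectors: one per row and one per column, per head (assumed).
    head_dim = channels // heads
    position = heads * head_dim * (height + width)
    return qkv + position


# Assumed sizes for one of the final bottleneck blocks.
conv = conv3x3_params(512)
attn = self_attention_params(512, heads=4, height=14, width=14)
print(f"3x3 convolution: {conv:,} weights")
print(f"self-attention:  {attn:,} weights")
print(f"ratio:           {conv / attn:.2f}x")
```

Under these assumptions the attention layer needs roughly a third of the weights of the convolution it replaces, because the convolution pays for nine spatial taps per channel pair while attention pays for three projections plus a small position table. The saving applies only to the replaced layers, so the effect on the whole network is smaller.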

Source: Bottleneck Transformers for Visual Recognition

