
Data-efficient Image Transformer

Introduced by Touvron et al. in Training data-efficient image transformers & distillation through attention

A Data-Efficient Image Transformer (DeiT) is a Vision Transformer for image classification that is trained with a teacher-student strategy specific to transformers. It adds a distillation token to the input sequence, which lets the student learn from the teacher through attention.

Source: Training data-efficient image transformers & distillation through attention
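The distillation token is appended alongside the class token and the patch embeddings, interacts with them through self-attention, and feeds a separate head supervised by the teacher's hard predictions. Below is a minimal sketch of this setup in PyTorch, assuming a user-supplied patch embedding and transformer encoder; the names `DistilledViT` and `hard_distillation_loss` are illustrative, not taken from the paper's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistilledViT(nn.Module):
    """Sketch of a ViT with an extra distillation token and two heads."""

    def __init__(self, patch_embed, encoder, embed_dim, num_classes):
        super().__init__()
        self.patch_embed = patch_embed                # image -> (B, N, D) patch embeddings
        self.encoder = encoder                        # transformer encoder over the token sequence
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.dist_token = nn.Parameter(torch.zeros(1, 1, embed_dim))   # distillation token
        self.head = nn.Linear(embed_dim, num_classes)       # classification head (class token)
        self.head_dist = nn.Linear(embed_dim, num_classes)  # distillation head (distillation token)

    def forward(self, x):
        patches = self.patch_embed(x)                 # (B, N, D)
        B = patches.shape[0]
        tokens = torch.cat([self.cls_token.expand(B, -1, -1),
                            self.dist_token.expand(B, -1, -1),
                            patches], dim=1)
        tokens = self.encoder(tokens)                 # self-attention mixes both tokens with the patches
        return self.head(tokens[:, 0]), self.head_dist(tokens[:, 1])

def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels):
    """Average of CE with the true labels and CE with the teacher's hard predictions."""
    teacher_labels = teacher_logits.argmax(dim=-1)
    return 0.5 * F.cross_entropy(cls_logits, labels) + \
           0.5 * F.cross_entropy(dist_logits, teacher_labels)
```

At test time the paper fuses the two classifiers by averaging their softmax outputs.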

Latest Papers

PAPER | AUTHORS | DATE
Going deeper with Image Transformers | Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou | 2021-03-31
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification | Chun-Fu Chen, Quanfu Fan, Rameswar Panda | 2021-03-27
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases | Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun | 2021-03-19
Transformer in Transformer | Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang | 2021-02-27
Conditional Positional Encodings for Vision Transformers | Xiangxiang Chu, Zhi Tian, Bo Zhang, Xinlong Wang, Xiaolin Wei, Huaxia Xia, Chunhua Shen | 2021-02-22
Training data-efficient image transformers & distillation through attention | Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou | 2020-12-23

Tasks

TASK | PAPERS | SHARE
Image Classification | 6 | 75.00%
Fine-Grained Image Classification | 2 | 25.00%
