A Data-Efficient Image Transformer (DeiT) is a Vision Transformer for image classification. The model is trained with a teacher-student strategy specific to transformers: a learned distillation token ensures that the student learns from the teacher through the attention mechanism.
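The hard-label variant of this distillation objective can be sketched as follows: the class token's output is supervised by the true label, while the distillation token's output is supervised by the teacher's predicted (argmax) label. This is a minimal NumPy sketch under that reading of the paper; the function names and the equal 1/2 weighting are illustrative.

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable softmax cross-entropy for a single example.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

def deit_hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Hard-distillation objective (sketch): the class-token head is
    trained against the ground-truth label, and the distillation-token
    head against the teacher's hard prediction."""
    teacher_label = int(np.argmax(teacher_logits))
    return 0.5 * cross_entropy(cls_logits, label) \
         + 0.5 * cross_entropy(dist_logits, teacher_label)
```

At inference time, the paper fuses the two heads (e.g., by averaging the class-token and distillation-token softmax outputs) rather than using either alone.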
Source: Training data-efficient image transformers & distillation through attention
Task | Papers | Share
---|---|---
Image Classification | 25 | 19.38%
Object Detection | 11 | 8.53%
Quantization | 10 | 7.75%
Semantic Segmentation | 8 | 6.20%
Self-Supervised Learning | 5 | 3.88%
Efficient ViTs | 4 | 3.10%
Fine-Grained Image Classification | 4 | 3.10%
Mamba | 3 | 2.33%
Document Image Classification | 3 | 2.33%
Component | Type
---|---
| Attention Mechanisms
| Regularization
| Regularization
| Feedforward Networks
| Attention Modules