Vision Transformer

Introduced by Dosovitskiy et al. in An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

The Vision Transformer (ViT) is an image classification model that applies a Transformer-like architecture to a sequence of patches of the image.

Source: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
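
The patch step mentioned in the description can be illustrated with a minimal sketch. It assumes non-overlapping square patches and an image stored as nested lists of shape height x width x channels; the function name split_into_patches is hypothetical and is not taken from the paper or from any library. It covers only the splitting step. In the model, each flattened patch would then be embedded and passed to the Transformer, which is not shown here.

```python
def split_into_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches.

    Returns a list of (H // P) * (W // P) patches in row-major order,
    each a flat list of P * P * C values. Illustrative helper only.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one P x P x C patch: rows first, then pixels, then channels.
            patch = [
                channel_value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel_value in pixel
            ]
            patches.append(patch)
    return patches


# Toy 4 x 4 image with 1 channel; each pixel value encodes its (row, column) position.
toy_image = [[[10 * r + c] for c in range(4)] for r in range(4)]
toy_patches = split_into_patches(toy_image, patch_size=2)
print(len(toy_patches))  # 4
print(toy_patches[0])    # [0, 1, 10, 11]
```

With 16 x 16 patches, as in the paper title, a 224 x 224 RGB image would give (224 / 16)^2 = 196 patches of 16 * 16 * 3 = 768 values each, so the sequence length seen by the Transformer depends on the image resolution and the patch size.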

Latest Papers

PAPER, AUTHORS, DATE
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo, Young-Ju Choi, Young-Woon Lee, Byung-Gyu Kim
2021-04-03
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
Ajay Jain, Matthew Tancik, Pieter Abbeel
2021-04-01
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh
2021-03-30
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang
2021-03-29
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao
2021-03-29
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen, Quanfu Fan, Rameswar Panda
2021-03-27
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit
2021-03-26
Vision Transformers for Dense Prediction
René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
2021-03-24
Danish Fungi 2020 -- Not Just Another Image Recognition Dataset
Lukáš Picek, Milan Šulc, Jiří Matas, Jacob Heilmann-Clausen, Thomas S. Jeppesen, Thomas Læssøe, Tobias Frøslev
2021-03-18
TransFG: A Transformer Architecture for Fine-grained Recognition
Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille
2021-03-14
Vision Transformer for COVID-19 CXR Diagnosis using Chest X-ray Feature Corpus
Sangjoon Park, Gwanghyun Kim, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye
2021-03-12
Severity Quantification and Lesion Localization of COVID-19 on CXR using Vision Transformer
Gwanghyun Kim, Sangjoon Park, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye
2021-03-12
Deepfake Video Detection Using Convolutional Vision Transformer
Deressa Wodajo, Solomon Atnafu
2021-02-22
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu, Zhi Tian, Bo Zhang, Xinlong Wang, Xiaolin Wei, Huaxia Xia, Chunhua Shen
2021-02-22
TransReID: Transformer-based Object Re-Identification
Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, Wei Jiang
2021-02-08
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers
Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr
2021-02-05
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan
2021-01-28
DAF:re: A Challenging, Crowd-Sourced, Large-Scale, Long-Tailed Dataset For Anime Character Recognition
Edwin Arkel Rios, Wen-Huang Cheng, Bo-Cheng Lai
2021-01-21
Investigating the Vision Transformer Model for Image Retrieval Tasks
Socratis Gkelios, Yiannis Boutalis, Savvas A. Chatzichristofis
2021-01-11
Transformer for Image Quality Assessment
Junyong You, Jari Korhonen
2020-12-30
Toward Transformer-Based Object Detection
Josh Beal, Eric Kim, Eric Tzeng, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk
2020-12-17
AdaBins: Depth Estimation using Adaptive Bins
Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
2020-11-28
On the Effectiveness of Vision Transformers for Zero-shot Face Anti-Spoofing
Anjith George, Sebastien Marcel
2020-11-16
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
2020-10-22

Categories