Vision Transformer

Introduced by Dosovitskiy et al. in "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"

The Vision Transformer (ViT) is an image classification model that applies a Transformer-like architecture to a sequence of patches of the image rather than to individual pixels.
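
To make "patches of the image" concrete, here is a minimal NumPy sketch of the patch-splitting step only. The function name image_to_patches and the 224x224 RGB input are assumptions chosen for illustration; the 16x16 patch size follows the paper's title. The sketch does not include the embedding or Transformer layers.

```python
import numpy as np

def image_to_patches(image, patch_size=16):
    """Split an (H, W, C) image into flattened, non-overlapping square patches.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # Expose the patch grid as separate axes: (rows, p, cols, p, C).
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    # Bring the two grid axes to the front: (rows, cols, p, p, C).
    patches = patches.transpose(0, 2, 1, 3, 4)
    # Flatten each patch into one vector: (rows * cols, p * p * C).
    return patches.reshape(rows * cols, patch_size * patch_size * c)

# Assumed input: a 224x224 RGB image.
image = np.zeros((224, 224, 3), dtype=np.float32)
patches = image_to_patches(image)
print(patches.shape)  # (196, 768)
```

Each row of the result is one flattened patch. Under these assumptions the image becomes a sequence of 196 vectors of length 768, which is the kind of token sequence a Transformer consumes.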

Source: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
