no code implementations • 15 Dec 2022 • Loic Themyr, Clement Rambour, Nicolas Thome, Toby Collins, Alexandre Hostettler
In particular, vision transformers construct compressed global representations through self-attention and learnable class tokens.