Vision Transformers

Conditional Position Encoding Vision Transformer

Introduced by Chu et al. in Conditional Positional Encodings for Vision Transformers

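CPVT, or Conditional Position Encoding Vision Transformer, is a vision transformer that replaces fixed or learned positional embeddings with conditional positional encodings, which are generated dynamically from the local neighborhood of each input token. Apart from the new encodings, it follows the same architecture as ViT and DeiT.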

Source: Conditional Positional Encodings for Vision Transformers
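
As an illustration of how a conditional positional encoding can be computed, the sketch below follows the general idea of the paper's Positional Encoding Generator: the patch tokens are treated as a 2D grid, a depthwise 3x3 convolution with zero padding is applied, and the result is added back to the tokens. This is a minimal, dependency-free sketch and not the authors' implementation. The function name `peg`, the list-of-lists token layout, and the hand-picked all-ones kernel are assumptions made for illustration; in a real model the kernels would be learned, and the class token (omitted here) would need separate handling.

```python
def peg(tokens, height, width, kernels):
    """Add a conditional positional encoding to a sequence of patch tokens.

    tokens:  list of height*width vectors, each of length C, in row-major order
    kernels: list of C 3x3 kernels, one per channel (depthwise convolution)
    Hypothetical helper for illustration only.
    """
    assert len(tokens) == height * width
    channels = len(tokens[0])
    out = []
    for i in range(height):
        for j in range(width):
            vec = []
            for c in range(channels):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        # Zero padding: neighbours outside the grid contribute nothing.
                        if 0 <= ni < height and 0 <= nj < width:
                            acc += kernels[c][di + 1][dj + 1] * tokens[ni * width + nj][c]
                # Residual connection: the token plus its positional encoding.
                vec.append(tokens[i * width + j][c] + acc)
            out.append(vec)
    return out


# Nine identical single-channel tokens on a 3x3 grid, with an all-ones kernel.
ones_kernel = [[[1.0] * 3 for _ in range(3)]]
grid = [[1.0] for _ in range(9)]
encoded = peg(grid, 3, 3, ones_kernel)
print(encoded[0], encoded[1], encoded[4])  # corner, edge, centre: [5.0] [7.0] [10.0]
```

Although every input token is identical, the corner, edge, and centre outputs differ, because the zero padding lets the convolution sense where the grid ends. The same function also runs unchanged on a different grid size (for example 2x4), which is the practical advantage over a fixed-length table of learned position embeddings.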

Tasks

Task                     Papers   Share
Instance Segmentation    1        16.67%
Semantic Segmentation    1        16.67%
Classification           1        16.67%
General Classification   1        16.67%
Image Classification     1        16.67%
Translation              1        16.67%