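CPVT, or Conditional Position encoding Vision Transformer, is a vision transformer that uses conditional positional encodings, which are generated dynamically from the local neighborhood of the input tokens rather than being fixed or learned in advance. Apart from the new encodings, it follows the same architecture as ViT and DeiT.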
Source: Conditional Positional Encodings for Vision Transformers

Task | Papers | Share |
---|---|---|
Instance Segmentation | 1 | 16.67% |
Semantic Segmentation | 1 | 16.67% |
Classification | 1 | 16.67% |
General Classification | 1 | 16.67% |
Image Classification | 1 | 16.67% |
Translation | 1 | 16.67% |
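
The description above does not say how a conditional positional encoding is computed. As a rough illustration only, the following pure-Python sketch assumes one common formulation: the patch tokens are reshaped to a 2D grid, a zero-padded 3x3 depthwise convolution produces the encoding, and the encoding is added back to the tokens. The function names, the list-based data layout, and the toy kernel are all hypothetical and are not taken from the source; a real model would use a learned kernel in a tensor library.

```python
from typing import List

Token = List[float]                 # one token: a vector of channel values
Kernels = List[List[List[float]]]   # [channel][3][3], one 3x3 kernel per channel


def depthwise_conv3x3(grid: List[List[Token]], kernels: Kernels) -> List[List[Token]]:
    """Zero-padded 3x3 depthwise convolution over an [h][w][channels] grid."""
    h, w, c = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for ch in range(c):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        y, x = i + di, j + dj
                        # Positions outside the grid count as zeros (zero padding).
                        if 0 <= y < h and 0 <= x < w:
                            acc += kernels[ch][di + 1][dj + 1] * grid[y][x][ch]
                out[i][j][ch] = acc
    return out


def add_conditional_encoding(tokens: List[Token], h: int, w: int,
                             kernels: Kernels) -> List[Token]:
    """Reshape h*w tokens to a grid, compute the encoding, and add it to the tokens."""
    assert len(tokens) == h * w, "token count must match the grid size"
    grid = [tokens[r * w:(r + 1) * w] for r in range(h)]
    enc = depthwise_conv3x3(grid, kernels)
    return [
        [t + e for t, e in zip(grid[r][col], enc[r][col])]
        for r in range(h)
        for col in range(w)
    ]


# Toy usage: nine identical single-channel tokens on a 3x3 grid, all-ones kernel.
tokens = [[1.0] for _ in range(9)]
kernels = [[[1.0] * 3 for _ in range(3)]]
print([t[0] for t in add_conditional_encoding(tokens, 3, 3, kernels)])
# prints [5.0, 7.0, 5.0, 7.0, 10.0, 7.0, 5.0, 7.0, 5.0]
```

In this toy run every input token is identical, yet corner, edge, and center tokens end up with different values because the zero padding changes what each neighborhood contains. Because the encoding is computed from the token grid itself, the same function works for any grid size without resizing a stored table.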