Position Embeddings

Conditional Positional Encoding

Introduced by Chu et al. in Conditional Positional Encodings for Vision Transformers

Conditional Positional Encoding, or CPE, is a type of positional encoding for vision transformers. Unlike previous fixed or learnable positional encodings, which are predefined and independent of the input tokens, CPE is dynamically generated and conditioned on the local neighborhood of the input tokens. As a result, CPE can generalize to input sequences longer than any seen during training, and it preserves the translation invariance desired in image classification. CPE is implemented with a Position Encoding Generator (PEG) and can be incorporated into existing Transformer frameworks.

Source: Conditional Positional Encodings for Vision Transformers
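To make the idea concrete, below is a minimal numpy sketch of a PEG as described in the paper: a depthwise convolution with zero padding applied to the token sequence after reshaping it back onto the 2-D image grid, with the result added to the tokens as a residual. The function name `peg`, the argument layout, and the plain nested-loop convolution are illustrative choices, not the authors' implementation (which uses a learned convolution layer in a deep-learning framework).

```python
import numpy as np

def peg(tokens, height, width, kernel):
    """Position Encoding Generator (PEG) sketch.

    tokens: (N, C) patch tokens (no class token), with N = height * width
    kernel: (C, k, k) depthwise weights, one k x k filter per channel
    Returns tokens plus conditional positional encodings, same shape.
    """
    n, c = tokens.shape
    assert n == height * width
    k = kernel.shape[-1]
    pad = k // 2
    # Reshape the token sequence back onto the 2-D image grid: (C, H, W).
    grid = tokens.T.reshape(c, height, width)
    # Zero padding supplies the boundary signal that lets the convolution
    # infer absolute position near image borders.
    padded = np.pad(grid, ((0, 0), (pad, pad), (pad, pad)))
    pe = np.zeros_like(grid)
    for ch in range(c):  # depthwise: each channel is convolved independently
        for i in range(height):
            for j in range(width):
                pe[ch, i, j] = np.sum(padded[ch, i:i + k, j:j + k] * kernel[ch])
    # Residual connection: the generated encodings are added to the inputs.
    return tokens + pe.reshape(c, n).T
```

Because the encoding is produced by a convolution rather than a fixed table, the same `peg` call works for any `height` and `width`, which is what allows the model to handle longer input sequences at test time.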

Tasks


Task Papers Share
Semantic Segmentation 2 22.22%
Image Classification 2 22.22%
Instance Segmentation 1 11.11%
Novel View Synthesis 1 11.11%
Classification 1 11.11%
General Classification 1 11.11%
Translation 1 11.11%