The Affine Operator is an affine transformation layer introduced in the ResMLP architecture. It replaces the layer normalization used in Transformer-based networks; this is possible because ResMLP contains no self-attention layers, which makes training more stable and allows a simpler affine transformation.
The affine operator is defined as:
$$ \operatorname{Aff}_{\mathbf{\alpha}, \mathbf{\beta}}(\mathbf{x})=\operatorname{Diag}(\mathbf{\alpha}) \mathbf{x}+\mathbf{\beta} $$
where $\alpha$ and $\beta$ are learnable weight vectors. The operation only rescales and shifts the input element-wise. It has several advantages over other normalization operations: first, unlike Layer Normalization, it has no cost at inference time, since it can be absorbed into the adjacent linear layer. Second, unlike BatchNorm and Layer Normalization, the Aff operator does not depend on batch statistics.
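The definition and the inference-time folding can be sketched in a few lines of numpy. This is an illustrative sketch, not the ResMLP reference code; the function names `aff` and `fold_aff_into_linear` are made up for this example. The folding identity is $W(\operatorname{Diag}(\alpha)\mathbf{x}+\beta)+b = (W\operatorname{Diag}(\alpha))\mathbf{x} + (W\beta + b)$.

```python
import numpy as np

def aff(x, alpha, beta):
    # Diag(alpha) @ x + beta is just an element-wise rescale and shift.
    return alpha * x + beta

def fold_aff_into_linear(W, b, alpha, beta):
    # Absorb Aff into the following linear layer W @ x + b:
    # W @ (alpha * x + beta) + b == (W * alpha) @ x + (W @ beta + b).
    # W * alpha scales column j of W by alpha[j], i.e. W @ Diag(alpha).
    return W * alpha, W @ beta + b

rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal(d)
alpha, beta = rng.standard_normal(d), rng.standard_normal(d)
W, b = rng.standard_normal((d, d)), rng.standard_normal(d)

# The folded layer matches linear(Aff(x)) exactly, so Aff is free at inference.
W_folded, b_folded = fold_aff_into_linear(W, b, alpha, beta)
assert np.allclose(W @ aff(x, alpha, beta) + b, W_folded @ x + b_folded)
```

Because the fold is exact, a trained network can replace every Aff + linear pair with a single linear layer before deployment.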
Source: ResMLP: Feedforward networks for image classification with data-efficient training

Task                               Papers  Share

Image Classification               3       21.43%
Object Detection                   2       14.29%
Semantic Segmentation              2       14.29%
Out-of-Distribution Detection      1       7.14%
Adversarial Attack                 1       7.14%
Instance Segmentation              1       7.14%
Fine-Grained Image Classification  1       7.14%
General Classification             1       7.14%
Machine Translation                1       7.14%