Fixup Initialization

Introduced by Zhang et al. in Fixup Initialization: Residual Learning Without Normalization

Fixup Initialization, or Fixed-Update Initialization, is an initialization method that rescales the standard initialization of residual branches to account for the network architecture. Fixup aims to enable stable training of very deep residual networks at a maximal learning rate without normalization.

The steps are as follows:

  1. Initialize the classification layer and the last layer of each residual branch to 0.

  2. Initialize every other layer using a standard method, e.g. Kaiming Initialization, and scale only the weight layers inside residual branches by $L^{-\frac{1}{2m-2}}$, where $L$ is the number of residual branches and $m$ is the number of layers inside each branch.

  3. Add a scalar multiplier (initialized at 1) in every branch and a scalar bias (initialized at 0) before each convolution, linear, and element-wise activation layer.
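The steps above can be sketched in plain NumPy. This is a minimal, hypothetical illustration of one fully-connected residual branch (the class name, `width` parameter, and bias placement are illustrative assumptions, not the paper's reference code): the last weight layer starts at zero (step 1), the other layers are Kaiming-initialized and downscaled by $L^{-\frac{1}{2m-2}}$ (step 2), and a scalar multiplier and scalar biases are added (step 3).

```python
import numpy as np

rng = np.random.default_rng(0)

def kaiming_normal(fan_in, fan_out):
    # He/Kaiming normal initialization for a ReLU layer
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

class FixupBranch:
    """One residual branch with m weight layers in an L-block network,
    initialized per the Fixup recipe (illustrative sketch only)."""

    def __init__(self, L, m, width):
        scale = L ** (-1.0 / (2 * m - 2))        # step 2: rescaling factor
        self.W = [kaiming_normal(width, width) * scale for _ in range(m - 1)]
        self.W.append(np.zeros((width, width)))  # step 1: last layer starts at zero
        self.mult = 1.0                          # step 3: scalar multiplier (init 1)
        self.bias = [0.0] * (2 * m - 1)          # step 3: scalar biases (init 0)

    def forward(self, x):
        b = iter(self.bias)
        for i, W in enumerate(self.W):
            x = W @ (x + next(b))                    # bias before each weight layer
            if i < len(self.W) - 1:
                x = np.maximum(x + next(b), 0.0)     # bias before each ReLU
        return self.mult * x
```

Because the last layer of the branch is zero, the branch outputs zero at initialization, so every residual block starts as the identity mapping.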




| Task | Papers | Share |
| --- | --- | --- |
| Model Compression | 1 | 20.00% |
| Quantization | 1 | 20.00% |
| General Classification | 1 | 20.00% |
| Image Classification | 1 | 20.00% |
| Machine Translation | 1 | 20.00% |
