Initialization

SkipInit is a method that enables normalization-free training of deep residual networks by downscaling the residual branches at initialization. This is achieved by including a learnable scalar multiplier at the end of each residual branch, initialized to a small constant $\alpha$ (zero, or at most on the order of $1/\sqrt{d}$ for a network with $d$ residual blocks).
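
As a concrete illustration, the following is a minimal PyTorch-style sketch of a residual block with a SkipInit multiplier; the branch architecture, channel count, and the default of initializing the scalar to zero are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn


class SkipInitBlock(nn.Module):
    """Residual block whose branch output is scaled by a learnable scalar.

    With the scalar initialized to zero, the block computes the identity
    function at initialization.
    """

    def __init__(self, channels: int, alpha_init: float = 0.0):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
        )
        # SkipInit: a single learnable scalar at the end of the residual branch.
        self.alpha = nn.Parameter(torch.tensor(float(alpha_init)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.alpha * self.branch(x)


# At initialization (alpha = 0) the block acts as the identity:
block = SkipInitBlock(64)
x = torch.randn(2, 64, 32, 32)
assert torch.equal(block(x), x)
```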

The method is motivated by the theoretical finding that, at initialization, batch normalization downscales the hidden activations on the residual branch by a factor on the order of the square root of the network depth. As the depth of a residual network increases, the residual blocks are therefore increasingly dominated by the skip connection, which drives the functions computed by the residual blocks towards the identity, preserving signal propagation and ensuring well-behaved gradients. This motivates SkipInit, which achieves the same property through an initialization strategy rather than a normalization strategy, as the sketch below illustrates.
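
The signal-propagation argument can be sanity-checked with a small simulation. The sketch below uses fully connected residual branches with He-initialized random weights; the depth, width, and batch size are arbitrary choices rather than values from the paper. It compares the output variance when the branch is left unscaled, downscaled by a fixed factor of order $1/\sqrt{d}$, and zeroed out as SkipInit does at initialization.

```python
import torch

torch.manual_seed(0)
depth, width, batch = 64, 256, 128
x = torch.randn(batch, width)

# alpha = 1.0         -> unnormalized branch: variance roughly doubles per block
# alpha = 1/sqrt(d)   -> fixed downscaling of order 1/sqrt(depth): variance stays bounded
# alpha = 0.0         -> SkipInit at initialization: every block is the identity
for alpha in (1.0, depth ** -0.5, 0.0):
    h = x.clone()
    for _ in range(depth):
        w = torch.randn(width, width) * (2.0 / width) ** 0.5  # He-style init
        h = h + alpha * (torch.relu(h) @ w.t())
    print(f"alpha = {alpha:.3f}: output variance ~ {h.var().item():.2e}")
```

With the unscaled branch the activation variance grows exponentially with depth, whereas downscaling keeps it bounded and $\alpha = 0$ preserves it exactly.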

Source: Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks
