Shake-Shake Regularization aims to improve the generalization ability of multi-branch networks by replacing the standard summation of parallel branches with a stochastic affine combination. A typical pre-activation ResNet with two residual branches follows:
$$x_{i+1} = x_{i} + \mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(1\right)}\right) + \mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(2\right)}\right) $$
Shake-shake regularization introduces a random variable $\alpha_{i}$ following a uniform distribution between 0 and 1 during training:
$$x_{i+1} = x_{i} + \alpha_{i}\mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(1\right)}\right) + \left(1-\alpha_{i}\right)\mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(2\right)}\right) $$
Following the same logic as for dropout, all $\alpha_{i}$ are set to their expected value of $0.5$ at test time.
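In code, this train/test behavior fits naturally into a single residual block that switches on the module's mode. Below is a minimal PyTorch sketch, not the authors' implementation: the `ShakeShakeBlock` class name, the two 3×3 convolutional branch stacks, and the per-image sampling of $\alpha$ are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ShakeShakeBlock(nn.Module):
    """Sketch of a shake-shake residual block (illustrative, not the paper's code).

    Training: x + alpha * F1(x) + (1 - alpha) * F2(x), with alpha ~ U(0, 1).
    Test:     alpha is fixed at its expected value, 0.5.
    """

    def __init__(self, channels):
        super().__init__()
        # Two residual branches F(., W^(1)) and F(., W^(2)); the exact layer
        # layout here is a placeholder for whatever the branches contain.
        self.branch1 = self._make_branch(channels)
        self.branch2 = self._make_branch(channels)

    @staticmethod
    def _make_branch(channels):
        return nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        f1, f2 = self.branch1(x), self.branch2(x)
        if self.training:
            # One alpha per sample (assumption), broadcast over C, H, W.
            alpha = torch.rand(x.size(0), 1, 1, 1, device=x.device)
        else:
            # Test time: alpha fixed at its expectation of 0.5.
            alpha = 0.5
        return x + alpha * f1 + (1 - alpha) * f2
```

The sampling granularity (one $\alpha$ per mini-batch versus one per image) is a design choice; the sketch above draws one value per image.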
Source: Shake-Shake regularization
Task | Papers | Share |
---|---|---|
Image Classification | 2 | 28.57% |
General Classification | 1 | 14.29% |
Object Detection | 1 | 14.29% |
Image Augmentation | 1 | 14.29% |
Image Cropping | 1 | 14.29% |
Retrieval | 1 | 14.29% |