Shake-Shake Regularization aims to improve the generalization ability of multi-branch networks by replacing the standard summation of parallel branches with a stochastic affine combination. A typical pre-activation ResNet with two residual branches follows:
$$x_{i+1} = x_{i} + \mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(1\right)}\right) + \mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(2\right)}\right) $$
Shake-shake regularization introduces a random variable $\alpha_{i}$ following a uniform distribution between 0 and 1 during training:
$$x_{i+1} = x_{i} + \alpha_{i}\mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(1\right)}\right) + \left(1-\alpha_{i}\right)\mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(2\right)}\right) $$
Following the same logic as for dropout, all $\alpha_{i}$ are set to their expected value of $0.5$ at test time.
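In code, this train/test behavior fits naturally into a single residual block that switches on the module's mode. Below is a minimal PyTorch sketch, not the authors' implementation: the `ShakeShakeBlock` class name, the two 3×3 convolutional branch stacks, and the per-image sampling of $\alpha$ are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ShakeShakeBlock(nn.Module):
    """Sketch of a shake-shake residual block (illustrative, not the paper's code).

    Training: x + alpha * F1(x) + (1 - alpha) * F2(x), with alpha ~ U(0, 1).
    Test:     alpha is fixed at its expected value, 0.5.
    """

    def __init__(self, channels):
        super().__init__()
        # Two residual branches F(., W^(1)) and F(., W^(2)); the exact layer
        # layout here is a placeholder for whatever the branches contain.
        self.branch1 = self._make_branch(channels)
        self.branch2 = self._make_branch(channels)

    @staticmethod
    def _make_branch(channels):
        return nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        f1, f2 = self.branch1(x), self.branch2(x)
        if self.training:
            # One alpha per sample (assumption), broadcast over C, H, W.
            alpha = torch.rand(x.size(0), 1, 1, 1, device=x.device)
        else:
            # Test time: alpha fixed at its expectation of 0.5.
            alpha = 0.5
        return x + alpha * f1 + (1 - alpha) * f2
```

The sampling granularity (one $\alpha$ per mini-batch versus one per image) is a design choice; the sketch above draws one value per image.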
Source: Shake-Shake regularization
Task | Papers | Share |
---|---|---|
Image Classification | 2 | 28.57% |
General Classification | 1 | 14.29% |
Object Detection | 1 | 14.29% |
Image Augmentation | 1 | 14.29% |
Image Cropping | 1 | 14.29% |
Retrieval | 1 | 14.29% |