Stochastic Depth aims to shrink the depth of a network during training, while keeping it unchanged during testing. This is achieved by randomly dropping entire ResBlocks during training and bypassing their transformations through skip connections.
Let $b_{l} \in \{0, 1\}$ denote a Bernoulli random variable, which indicates whether the $l$th ResBlock is active ($b_{l} = 1$) or inactive ($b_{l} = 0$). Further, let us denote the “survival” probability of ResBlock $l$ as $p_{l} = \text{Pr}\left(b_{l} = 1\right)$. With this definition we can bypass the $l$th ResBlock by multiplying its function $f_{l}$ with $b_{l}$, and we extend the update rule to:
$$ H_{l} = \text{ReLU}\left(b_{l}f_{l}\left(H_{l-1}\right) + \text{id}\left(H_{l-1}\right)\right) $$
If $b_{l} = 1$, this reduces to the original ResNet update and the ResBlock remains unchanged. If $b_{l} = 0$, the ResBlock reduces to the identity function, $H_{l} = \text{id}\left(H_{l-1}\right)$.
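A minimal PyTorch sketch of this update rule follows. The two-convolution form of $f_{l}$ and the class name are illustrative assumptions, not part of the method; the test-time rescaling of $f_{l}$ by $p_{l}$ follows the original paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StochasticDepthResBlock(nn.Module):
    """Residual block that is randomly dropped during training.

    `survival_prob` is p_l = Pr(b_l = 1). At test time the block is
    always active, and f_l's output is scaled by p_l so that it matches
    the expected training-time output.
    """

    def __init__(self, channels: int, survival_prob: float = 0.8):
        super().__init__()
        self.survival_prob = survival_prob
        # f_l: a common two-convolution residual function (an assumption
        # here; stochastic depth works with any ResBlock design).
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Sample b_l ~ Bernoulli(p_l) and drop the whole block if b_l = 0.
            if torch.rand(1).item() < self.survival_prob:
                return F.relu(self.f(x) + x)  # b_l = 1: original update
            return x                          # b_l = 0: identity bypass
        # Test time: keep the block, rescale f_l by p_l.
        return F.relu(self.survival_prob * self.f(x) + x)
```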
Source: Deep Networks with Stochastic Depth
| Task | Papers | Share |
|---|---|---|
| Semantic Segmentation | 70 | 11.99% |
| Image Classification | 52 | 8.90% |
| Object Detection | 43 | 7.36% |
| Instance Segmentation | 21 | 3.60% |
| Image Segmentation | 18 | 3.08% |
| Medical Image Segmentation | 17 | 2.91% |
| Super-Resolution | 16 | 2.74% |
| Classification | 12 | 2.05% |
| Self-Supervised Learning | 11 | 1.88% |