All ResNeSt variants share the same training techniques and architectural components; only the model ID differs:

Training Techniques | SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock |
---|---|
Architecture | 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax |
IDs | resnest14d, resnest26d, resnest50d, resnest50d_1s4x24d, resnest50d_4s2x40d, resnest101e, resnest200e, resnest269e |
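To check which ResNeSt variants your installed timm version provides, you can list them by wildcard with `timm.list_models`, which is part of the public timm API:

```python
import timm

# Enumerate the ResNeSt model IDs bundled with timm; the names
# match the summaries above.
print(timm.list_models('resnest*'))
```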
ResNeSt is a variant of ResNet that stacks Split-Attention blocks in place of standard residual blocks. Within each block, the cardinal group representations are concatenated along the channel dimension: $V = \text{Concat}\{V^{1}, V^{2}, \cdots, V^{K}\}$. As in standard residual blocks, the final output $Y$ of the Split-Attention block is produced using a shortcut connection: $Y = V + X$ when the input and output feature maps share the same shape. For blocks with a stride, an appropriate transformation $\mathcal{T}$ is applied to the shortcut connection to align the output shapes: $Y = V + \mathcal{T}(X)$. For example, $\mathcal{T}$ can be a strided convolution or a combined convolution with pooling.
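This shortcut logic can be illustrated with a minimal PyTorch sketch of a Split-Attention-style block. This is a simplified sketch (a single cardinal group with radix splits), not the timm implementation; all module and parameter names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAttentionSketch(nn.Module):
    """Minimal Split-Attention block sketch: V is an attention-weighted
    sum over radix splits, and Y = V + T(X) as described above."""

    def __init__(self, in_ch, out_ch, radix=2, stride=1):
        super().__init__()
        self.radix = radix
        # Produce radix groups of out_ch feature maps in one convolution.
        self.conv = nn.Conv2d(in_ch, out_ch * radix, 3, stride=stride,
                              padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch * radix)
        # Bottleneck MLP mapping the pooled summary to per-split logits.
        mid = max(out_ch // 4, 8)
        self.fc1 = nn.Conv2d(out_ch, mid, 1)
        self.fc2 = nn.Conv2d(mid, out_ch * radix, 1)
        # T(X): align the shortcut when the shape changes; here a
        # combined pooling + 1x1 convolution.
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.AvgPool2d(stride, ceil_mode=True),
                nn.Conv2d(in_ch, out_ch, 1, bias=False))
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        b = x.size(0)
        u = F.relu(self.bn(self.conv(x)))                     # (B, r*C, H, W)
        splits = torch.stack(u.chunk(self.radix, dim=1), 1)   # (B, r, C, H, W)
        gap = splits.sum(1).mean((2, 3), keepdim=True)        # (B, C, 1, 1)
        attn = self.fc2(F.relu(self.fc1(gap)))                # (B, r*C, 1, 1)
        attn = attn.view(b, self.radix, -1, 1, 1).softmax(1)  # r-softmax
        v = (attn * splits).sum(1)                            # V: fused splits
        return v + self.shortcut(x)                           # Y = V + T(X)

# Example: a strided block that changes both resolution and channel count.
y = SplitAttentionSketch(64, 128, stride=2)(torch.randn(1, 64, 56, 56))
```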
To load a pretrained model:

```python
import timm

# Download the pretrained ImageNet weights and switch to inference mode.
m = timm.create_model('resnest14d', pretrained=True)
m.eval()
```
Replace the model name with the variant you want to use, e.g. `resnest14d`. You can find the IDs in the model summaries at the top of this page.
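To run the model on an image, you can build the matching preprocessing pipeline from the model's pretrained data config. The sketch below uses timm's `resolve_data_config` and `create_transform` helpers; the image path is a placeholder:

```python
import timm
import torch
from PIL import Image
from timm.data import resolve_data_config, create_transform

m = timm.create_model('resnest14d', pretrained=True)
m.eval()

# Build the eval transform (resize, crop, normalize) that matches
# the pretrained weights' data config.
config = resolve_data_config({}, model=m)
transform = create_transform(**config)

img = Image.open('dog.jpg').convert('RGB')  # placeholder image path
x = transform(img).unsqueeze(0)             # add a batch dimension

with torch.no_grad():
    probs = m(x).softmax(dim=-1)
top5_prob, top5_idx = probs.topk(5)
```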
To train a new model from scratch, you can follow the timm training recipe scripts; an example invocation is sketched below.
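A hypothetical single-GPU invocation of timm's train.py might look like the following. The hyperparameter values are illustrative (loosely following the ResNeSt paper's recipe), and exact flags vary by timm version:

```bash
# Illustrative only: flag availability depends on the installed timm version.
python train.py /path/to/imagenet \
    --model resnest14d \
    --epochs 270 \
    -b 64 \
    --lr 0.1 \
    --weight-decay 1e-4 \
    --smoothing 0.1 \
    --mixup 0.2 \
    --aa original
```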
```bibtex
@misc{zhang2020resnest,
  title={ResNeSt: Split-Attention Networks},
  author={Hang Zhang and Chongruo Wu and Zhongyue Zhang and Yi Zhu and Haibin Lin and Zhi Zhang and Yue Sun and Tong He and Jonas Mueller and R. Manmatha and Mu Li and Alexander Smola},
  year={2020},
  eprint={2004.08955},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
MODEL | TOP-1 ACCURACY | TOP-5 ACCURACY |
---|---|---|
resnest269e | 84.53% | 96.99% |
resnest200e | 83.85% | 96.89% |
resnest101e | 82.88% | 96.31% |
resnest50d_4s2x40d | 81.11% | 95.55% |
resnest50d_1s4x24d | 81.00% | 95.33% |
resnest50d | 80.96% | 95.38% |
resnest26d | 78.48% | 94.30% |
resnest14d | 75.51% | 92.52% |