ResNeSt

Last updated on Feb 14, 2021

resnest101e

Parameters 48 Million
FLOPs 17 Billion
File Size 184.81 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest101e
LR 0.1
Epochs 270
Layers 101
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 4096
Image Size 256
Weight Decay 0.0001
Interpolation bilinear
resnest14d

Parameters 11 Million
FLOPs 4 Billion
File Size 40.59 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest14d
LR 0.1
Epochs 270
Layers 14
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 8192
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
resnest200e

Parameters 70 Million
FLOPs 46 Billion
File Size 184.81 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest200e
LR 0.1
Epochs 270
Layers 200
Dropout 0.2
Crop Pct 0.909
Momentum 0.9
Batch Size 2048
Image Size 320
Weight Decay 0.0001
Interpolation bicubic
resnest269e

Parameters 111 Million
FLOPs 101 Billion
File Size 424.77 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest269e
LR 0.1
Epochs 270
Layers 269
Dropout 0.2
Crop Pct 0.928
Momentum 0.9
Batch Size 2048
Image Size 416
Weight Decay 0.0001
Interpolation bicubic
resnest26d

Parameters 17 Million
FLOPs 5 Billion
File Size 65.30 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest26d
LR 0.1
Epochs 270
Layers 26
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 8192
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
resnest50d

Parameters 27 Million
FLOPs 7 Billion
File Size 105.16 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest50d
LR 0.1
Epochs 270
Layers 50
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 8192
Image Size 224
Weight Decay 0.0001
Interpolation bilinear
resnest50d_1s4x24d

Parameters 26 Million
FLOPs 6 Billion
File Size 98.27 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest50d_1s4x24d
LR 0.1
Epochs 270
Layers 50
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 8192
Image Size 224
Weight Decay 0.0001
Interpolation bicubic
resnest50d_4s2x40d

Parameters 30 Million
FLOPs 6 Billion
File Size 116.48 MB
Training Data ImageNet
Training Resources 64x NVIDIA V100 GPUs
Training Techniques SGD with Momentum, Weight Decay, Label Smoothing, AutoAugment, Mixup, DropBlock
Architecture 1x1 Convolution, Convolution, Residual Connection, Split Attention, ReLU, Max Pooling, Global Average Pooling, Dense Connections, Softmax
ID resnest50d_4s2x40d
LR 0.1
Epochs 270
Layers 50
Dropout 0.2
Crop Pct 0.875
Momentum 0.9
Batch Size 8192
Image Size 224
Weight Decay 0.0001
Interpolation bicubic

Summary

ResNeSt is a variant of ResNet that stacks Split-Attention blocks in place of the standard residual blocks. Within each block, the cardinal group representations are concatenated along the channel dimension: $V = \text{Concat}\{V^{1}, V^{2}, \cdots, V^{K}\}$. As in standard residual blocks, the final output $Y$ of a Split-Attention block is produced with a shortcut connection: $Y = V + X$, provided the input and output feature maps share the same shape. For blocks with a stride, an appropriate transformation $\mathcal{T}$ is applied to the shortcut connection to align the output shapes: $Y = V + \mathcal{T}(X)$. For example, $\mathcal{T}$ can be a strided convolution or a combined convolution with pooling.
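The concatenation-plus-shortcut logic above can be sketched as follows. This is an illustrative NumPy sketch, not the actual timm implementation; `transform` stands in for $\mathcal{T}$ and the per-group attention computation is assumed to have already produced the `cardinal_outputs`:

```python
import numpy as np

def split_attention_output(cardinal_outputs, x, transform=None):
    """Combine cardinal group representations with a residual shortcut.

    cardinal_outputs: list of K arrays, each (H, W, C_k), concatenated along
    the channel axis to form V = Concat{V^1, ..., V^K}.
    transform: optional callable T applied to the shortcut when input and
    output shapes differ (e.g. a strided convolution in the real network).
    """
    v = np.concatenate(cardinal_outputs, axis=-1)      # V
    shortcut = transform(x) if transform is not None else x
    return v + shortcut                                # Y = V + X or V + T(X)

# Example: two cardinal groups of 4 channels each, identity shortcut
x = np.ones((4, 4, 8))
groups = [np.zeros((4, 4, 4)), np.zeros((4, 4, 4))]
y = split_attention_output(groups, x)
```

When the block is strided, the real network cannot use the identity shortcut because `v` and `x` no longer match in spatial size, which is exactly when a `transform` is supplied.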

How do I load this model?

To load a pretrained model:

import timm
m = timm.create_model('resnest14d', pretrained=True)
m.eval()

Replace the model name with the variant you want to use, e.g. resnest14d. You can find the IDs in the model summaries at the top of this page.

How do I train this model?

You can follow the timm recipe scripts for training a new model from scratch.
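For example, a multi-GPU launch with timm's reference training script might look like the sketch below. The flag names follow timm's `train.py`, and the hyperparameters are copied from the resnest50d summary above; treat it as illustrative rather than the exact published ResNeSt recipe:

```shell
# Launch timm's reference training script on 8 GPUs.
# -b is the per-GPU batch size; adjust so the global batch matches your setup.
./distributed_train.sh 8 /path/to/imagenet \
  --model resnest50d \
  --lr 0.1 --momentum 0.9 --weight-decay 1e-4 \
  --epochs 270 -b 128 \
  --smoothing 0.1 --mixup 0.2 --aa original \
  --drop-block 0.2
```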

Citation

@misc{zhang2020resnest,
      title={ResNeSt: Split-Attention Networks}, 
      author={Hang Zhang and Chongruo Wu and Zhongyue Zhang and Yi Zhu and Haibin Lin and Zhi Zhang and Yue Sun and Tong He and Jonas Mueller and R. Manmatha and Mu Li and Alexander Smola},
      year={2020},
      eprint={2004.08955},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

| Model | Top-1 Accuracy | Top-5 Accuracy |
| --- | --- | --- |
| resnest269e | 84.53% | 96.99% |
| resnest200e | 83.85% | 96.89% |
| resnest101e | 82.88% | 96.31% |
| resnest50d_4s2x40d | 81.11% | 95.55% |
| resnest50d_1s4x24d | 81.0% | 95.33% |
| resnest50d | 80.96% | 95.38% |
| resnest26d | 78.48% | 94.3% |
| resnest14d | 75.51% | 92.52% |