| Training Techniques | Nesterov Accelerated Gradient, Weight Decay |
|---|---|
| Architecture | 1x1 Convolution, Batch Normalization, Convolution, Grouped Convolution, Global Average Pooling, ResNeXt Block, Residual Connection, ReLU, Max Pooling, Softmax |
| IDs | ig_resnext101_32x8d, ig_resnext101_32x16d, ig_resnext101_32x32d, ig_resnext101_32x48d |
A ResNeXt repeats a building block that aggregates a set of transformations with the same topology. Compared to a ResNet, it exposes a new dimension, cardinality $C$ (the size of the set of transformations), as an essential factor in addition to the dimensions of depth and width.
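To see why increasing cardinality costs little, compare parameter counts for a single bottleneck block. Following the template in the ResNeXt paper (a 256-d input/output block; the helper functions below are our own illustration, not part of timm), a 32x4d aggregated block has roughly the same parameter count as a plain width-64 ResNet bottleneck:

```python
# Approximate parameter counts for one bottleneck block with 256-d
# input/output (conv weights only; biases and BatchNorm ignored).

def resnet_block_params(width=64, channels=256):
    # 1x1 reduce -> 3x3 -> 1x1 expand
    return channels * width + 3 * 3 * width * width + width * channels

def resnext_block_params(cardinality=32, group_width=4, channels=256):
    # C parallel paths, each: 1x1 reduce to d, 3x3 on d, 1x1 expand back
    d = group_width
    return cardinality * (channels * d + 3 * 3 * d * d + d * channels)

print(resnet_block_params())   # 69632
print(resnext_block_params())  # 70144
```

The two counts are within about 1% of each other, which is why cardinality can be traded against width at essentially constant complexity.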
These models were trained on billions of Instagram images, using thousands of distinct hashtags as labels, and they exhibit excellent transfer-learning performance.
Please note the CC-BY-NC 4.0 license on these weights: they are for non-commercial use only.
To load a pretrained model:

```python
import timm

m = timm.create_model('ig_resnext101_32x8d', pretrained=True)
m.eval()
```
Replace the model name with the variant you want to use, e.g. `ig_resnext101_32x8d`. You can find the IDs in the model summaries at the top of this page.
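The suffix in each ID follows the standard ResNeXt naming convention: `32x16d` means cardinality 32 with a per-group bottleneck width of 16, and `101` is the network depth. A small sketch of decoding it (the helper below is hypothetical, not a timm function):

```python
import re

def parse_resnext_id(model_id):
    """Extract depth, cardinality, and bottleneck width from an ID
    like 'ig_resnext101_32x16d' (naming per the ResNeXt paper)."""
    m = re.search(r'resnext(\d+)_(\d+)x(\d+)d', model_id)
    if m is None:
        raise ValueError(f'not a ResNeXt ID: {model_id}')
    depth, cardinality, width = map(int, m.groups())
    return {'depth': depth, 'cardinality': cardinality, 'width': width}

print(parse_resnext_id('ig_resnext101_32x16d'))
# {'depth': 101, 'cardinality': 32, 'width': 16}
```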
To train a new model from scratch, you can follow the timm recipe scripts.
```bibtex
@misc{mahajan2018exploring,
  title={Exploring the Limits of Weakly Supervised Pretraining},
  author={Dhruv Mahajan and Ross Girshick and Vignesh Ramanathan and Kaiming He and Manohar Paluri and Yixuan Li and Ashwin Bharambe and Laurens van der Maaten},
  year={2018},
  eprint={1805.00932},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
MODEL | TOP 1 ACCURACY | TOP 5 ACCURACY |
---|---|---|
ig_resnext101_32x48d | 85.42% | 97.58% |
ig_resnext101_32x32d | 85.09% | 97.44% |
ig_resnext101_32x16d | 84.16% | 97.19% |
ig_resnext101_32x8d | 82.70% | 96.64% |