Big Transfer

Last updated on Feb 14, 2021

resnetv2_101x1_bitm

Parameters: 45 Million
Layers: 101
File Size: 170.00 MB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_101x1_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear
resnetv2_101x3_bitm

Parameters: 388 Million
Layers: 101
File Size: 1.48 GB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_101x3_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear
resnetv2_152x2_bitm

Parameters: 236 Million
Layers: 152
File Size: 901.68 MB
Training Data: JFT-300M, ImageNet
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_152x2_bitm
Crop Pct: 1.0
Image Size: 480
Interpolation: bilinear
resnetv2_152x4_bitm

Parameters: 937 Million
Layers: 152
File Size: 3.57 GB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_152x4_bitm
Crop Pct: 1.0
Image Size: 480
Interpolation: bilinear
resnetv2_50x1_bitm

Parameters: 26 Million
Layers: 50
File Size: 97.51 MB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_50x1_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear
resnetv2_50x3_bitm

Parameters: 217 Million
Layers: 50
File Size: 829.05 MB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_50x3_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear

Summary

Big Transfer (BiT) is a pre-training recipe: a model is pre-trained on a large supervised source dataset, and its weights are then fine-tuned on the target task. The models in this collection are pre-trained on the JFT-300M dataset and fine-tuned on ImageNet.
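
In timm, the fine-tuning half of this recipe starts by swapping the classification head for the target task. A minimal sketch, assuming the standard num_classes argument of timm.create_model (the 10-class count below is a placeholder):

import timm

# Load BiT pre-trained weights and replace the classifier with a
# randomly initialized head for a hypothetical 10-class target task.
model = timm.create_model('resnetv2_50x1_bitm', pretrained=True, num_classes=10)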

How do I load this model?

To load a pretrained model:

import timm
m = timm.create_model('resnetv2_50x1_bitm', pretrained=True)
m.eval()

Replace the model name with the variant you want to use, e.g. resnetv2_50x1_bitm. You can find the IDs in the model summaries at the top of this page.
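
For inference, the preprocessing should match the settings listed in the model summaries (480x480 input, crop pct 1.0, bilinear interpolation). A minimal sketch using timm's data-config helpers; the image path is a placeholder:

import timm
import torch
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform

m = timm.create_model('resnetv2_50x1_bitm', pretrained=True)
m.eval()

# Build the eval transform (resize, normalize) from the model's
# pretrained config, which carries the 480px / bilinear settings above.
config = resolve_data_config({}, model=m)
transform = create_transform(**config)

img = Image.open('dog.jpg').convert('RGB')  # placeholder image path
x = transform(img).unsqueeze(0)             # add a batch dimension

with torch.no_grad():
    top5 = m(x).softmax(dim=-1).topk(5)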

How do I train this model?

You can follow the timm recipe scripts to train a new model from scratch.
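
The hyperparameters listed above (SGD with momentum 0.9, weight decay 0.0001, LR 0.03) translate directly into a standard PyTorch training step. A sketch under those settings, not the exact BiT schedule; the class count and the source of images/labels are placeholders:

import timm
import torch
import torch.nn as nn

# Placeholder class count for the target dataset.
model = timm.create_model('resnetv2_50x1_bitm', pretrained=True, num_classes=100)

# Optimizer settings mirror the hyperparameters in the model summaries.
optimizer = torch.optim.SGD(model.parameters(), lr=0.03,
                            momentum=0.9, weight_decay=0.0001)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    # One SGD step; images/labels come from a user-supplied dataloader.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()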

Citation

@misc{kolesnikov2020big,
      title={Big Transfer (BiT): General Visual Representation Learning}, 
      author={Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Joan Puigcerver and Jessica Yung and Sylvain Gelly and Neil Houlsby},
      year={2020},
      eprint={1912.11370},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

| Model | Top 1 Accuracy | Top 5 Accuracy |
| ------------------- | ------ | ------ |
| resnetv2_152x4_bitm | 84.95% | 97.45% |
| resnetv2_152x2_bitm | 84.4%  | 97.43% |
| resnetv2_101x3_bitm | 84.38% | 97.37% |
| resnetv2_50x3_bitm  | 83.75% | 97.12% |
| resnetv2_101x1_bitm | 82.21% | 96.47% |
| resnetv2_50x1_bitm  | 80.19% | 95.63% |
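
Top-1 and top-5 accuracy count a prediction as correct when the true label appears among the model's 1 or 5 highest-scoring classes. A small generic sketch of the computation (not timm's validation script):

import torch

def topk_accuracy(logits, labels, k=5):
    # Fraction of samples whose true label is among the k largest logits.
    topk = logits.topk(k, dim=-1).indices
    correct = (topk == labels.unsqueeze(-1)).any(dim=-1)
    return correct.float().mean().item()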