Big Transfer

Last updated on Feb 14, 2021

resnetv2_101x1_bitm

Parameters: 45 Million
Layers: 101
File Size: 170.00 MB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_101x1_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear
resnetv2_101x3_bitm

Parameters: 388 Million
Layers: 101
File Size: 1.48 GB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_101x3_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear
resnetv2_152x2_bitm

Parameters: 236 Million
Layers: 152
File Size: 901.68 MB
Training Data: JFT-300M, ImageNet
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_152x2_bitm
Crop Pct: 1.0
Image Size: 480
Interpolation: bilinear
resnetv2_152x4_bitm

Parameters: 937 Million
Layers: 152
File Size: 3.57 GB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_152x4_bitm
Crop Pct: 1.0
Image Size: 480
Interpolation: bilinear
resnetv2_50x1_bitm

Parameters: 26 Million
Layers: 50
File Size: 97.51 MB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_50x1_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear
resnetv2_50x3_bitm

Parameters: 217 Million
Layers: 50
File Size: 829.05 MB
Training Data: JFT-300M, ImageNet
Training Resources: Cloud TPUv3-512
Training Techniques: SGD with Momentum, Weight Decay, Mixup
Architecture: 1x1 Convolution, Bottleneck Residual Block, Group Normalization, Weight Standardization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID: resnetv2_50x3_bitm
LR: 0.03
Epochs: 90
Crop Pct: 1.0
Momentum: 0.9
Batch Size: 4096
Image Size: 480
Weight Decay: 0.0001
Interpolation: bilinear

Summary

Big Transfer (BiT) is a pre-training recipe: a model is pre-trained on a large supervised source dataset, and its weights are then fine-tuned on the target task. The models in this collection are pre-trained on the JFT-300M dataset and fine-tuned on ImageNet.
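
In timm, the fine-tuning half of this recipe starts by swapping the classification head for the target task. A minimal sketch, assuming the standard num_classes argument of timm.create_model (the 10-class count below is a placeholder):

import timm

# Load BiT pre-trained weights and replace the classifier with a
# randomly initialized head for a hypothetical 10-class target task.
model = timm.create_model('resnetv2_50x1_bitm', pretrained=True, num_classes=10)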

How do I load this model?

To load a pretrained model:

import timm
m = timm.create_model('resnetv2_50x1_bitm', pretrained=True)
m.eval()

Replace the model name with the variant you want to use, e.g. resnetv2_50x1_bitm. You can find the IDs in the model summaries at the top of this page.
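
For inference, the preprocessing should match the settings listed in the model summaries (480x480 input, crop pct 1.0, bilinear interpolation). A minimal sketch using timm's data-config helpers; the image path is a placeholder:

import timm
import torch
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform

m = timm.create_model('resnetv2_50x1_bitm', pretrained=True)
m.eval()

# Build the eval transform (resize, normalize) from the model's
# pretrained config, which carries the 480px / bilinear settings above.
config = resolve_data_config({}, model=m)
transform = create_transform(**config)

img = Image.open('dog.jpg').convert('RGB')  # placeholder image path
x = transform(img).unsqueeze(0)             # add a batch dimension

with torch.no_grad():
    top5 = m(x).softmax(dim=-1).topk(5)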

How do I train this model?

You can follow the timm recipe scripts to train a new model from scratch.
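
The hyperparameters listed above (SGD with momentum 0.9, weight decay 0.0001, LR 0.03) translate directly into a standard PyTorch training step. A sketch under those settings, not the exact BiT schedule; the class count and the source of images/labels are placeholders:

import timm
import torch
import torch.nn as nn

# Placeholder class count for the target dataset.
model = timm.create_model('resnetv2_50x1_bitm', pretrained=True, num_classes=100)

# Optimizer settings mirror the hyperparameters in the model summaries.
optimizer = torch.optim.SGD(model.parameters(), lr=0.03,
                            momentum=0.9, weight_decay=0.0001)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    # One SGD step; images/labels come from a user-supplied dataloader.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()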

Citation

@misc{kolesnikov2020big,
      title={Big Transfer (BiT): General Visual Representation Learning}, 
      author={Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Joan Puigcerver and Jessica Yung and Sylvain Gelly and Neil Houlsby},
      year={2020},
      eprint={1912.11370},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

| Model | Top 1 Accuracy | Top 5 Accuracy |
| ------------------- | ------ | ------ |
| resnetv2_152x4_bitm | 84.95% | 97.45% |
| resnetv2_152x2_bitm | 84.4%  | 97.43% |
| resnetv2_101x3_bitm | 84.38% | 97.37% |
| resnetv2_50x3_bitm  | 83.75% | 97.12% |
| resnetv2_101x1_bitm | 82.21% | 96.47% |
| resnetv2_50x1_bitm  | 80.19% | 95.63% |
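
Top-1 and top-5 accuracy count a prediction as correct when the true label appears among the model's 1 or 5 highest-scoring classes. A small generic sketch of the computation (not timm's validation script):

import torch

def topk_accuracy(logits, labels, k=5):
    # Fraction of samples whose true label is among the k largest logits.
    topk = logits.topk(k, dim=-1).indices
    correct = (topk == labels.unsqueeze(-1)).any(dim=-1)
    return correct.float().mean().item()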