Jigsaw

Last updated on Feb 27, 2021

Jigsaw AlexNet (Goyal19, ImageNet-1K)

Parameters 61 Million
FLOPs 715 Million
File Size 8.92 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture Convolution, Dropout, Dense Connections, ReLU, Max Pooling, Softmax
ID alexnet_in1k_jigsaw_goyal
Classes 1000
Jigsaw AlexNet (Goyal19, ImageNet-22K)

Parameters 61 Million
FLOPs 715 Million
File Size 8.92 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture Convolution, Dropout, Dense Connections, ReLU, Max Pooling, Softmax
ID alexnet_in22k_jigsaw_goyal
Classes 22000
Jigsaw AlexNet (Goyal19, YFCC100M)

Parameters 61 Million
FLOPs 715 Million
File Size 8.92 MB
Training Data ImageNet, YFCC100M
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture Convolution, Dropout, Dense Connections, ReLU, Max Pooling, Softmax
ID alexnet_yfcc100m_jigsaw_goyal
Jigsaw ResNet-50 - 100 permutations

Jigsaw ResNet-50 - 100 permutations achieves 83.3% Top 1 Accuracy on ImageNet


Parameters 26 Million
FLOPs 4 Billion
File Size 97.78 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_in1k_perm100_jigsaw
Layers 50
Classes 1000
Permutations 100
Width Multiplier 1
Jigsaw ResNet-50 - 10K permutations

Jigsaw ResNet-50 - 10K permutations achieves 81.9% Top 1 Accuracy on ImageNet


Parameters 26 Million
FLOPs 4 Billion
File Size 882.14 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_in1k_perm10k_jigsaw
LR 0.1
Layers 50
Classes 1000
Momentum 0.9
Permutations 10000
Width Multiplier 1
Jigsaw ResNet-50 (Goyal19, ImageNet-1K)

Parameters 26 Million
FLOPs 4 Billion
File Size 273.79 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_in1k_jigsaw_goyal
LR 0.1
Layers 50
Classes 1000
Momentum 0.9
Permutations 2000
Width Multiplier 1
Jigsaw ResNet-50 (Goyal19, ImageNet-22K)

Parameters 26 Million
FLOPs 4 Billion
File Size 273.79 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_in22k_jigsaw_goyal
LR 0.1
Layers 50
Classes 22000
Momentum 0.9
Permutations 2000
Width Multiplier 1
Jigsaw ResNet-50 (Goyal19, YFCC100M)

Parameters 26 Million
FLOPs 4 Billion
File Size 449.59 MB
Training Data ImageNet, YFCC100M
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_yfcc100m_jigsaw_goyal
LR 0.1
Layers 50
Momentum 0.9
Permutations 2000
Width Multiplier 1
Jigsaw ResNet-50 (ImageNet-1K, 2K permutations)

Jigsaw ResNet-50 (ImageNet-1K, 2K permutations) achieves 82% Top 1 Accuracy on ImageNet


Parameters 26 Million
FLOPs 4 Billion
File Size 332.77 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_in1k_perm2k_jigsaw
LR 0.1
Layers 50
Classes 1000
Momentum 0.9
Permutations 2000
Weight Decay 0.0001
Width Multiplier 1
Jigsaw ResNet-50 (ImageNet-22K, 2K permutations)

Jigsaw ResNet-50 (ImageNet-22K, 2K permutations) achieves 82.9% Top 1 Accuracy on ImageNet


Parameters 26 Million
FLOPs 4 Billion
File Size 333.20 MB
Training Data ImageNet
Training Resources 8 NVIDIA V100 GPUs

Training Techniques Jigsaw, Weight Decay, SGD with Momentum
Architecture 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax
ID rn50_in22k_perm2k_jigsaw
LR 0.1
Layers 50
Classes 22000
Momentum 0.9
Permutations 2000
Width Multiplier 1

Summary

Jigsaw is a self-supervised approach that uses jigsaw-like puzzles as the pretext task to learn image representations. This set of models improves on the original Jigsaw method along three axes:

  • Scaling pre-training data: pre-training on 100× more data (YFCC-100M).
  • Scaling model capacity: moving to a higher-capacity model, ResNet-50, which shows larger improvements as the data size increases.
  • Scaling problem complexity: increasing the ‘hardness’ of the pretext task (more permutations); higher-capacity models show larger improvements on ‘harder’ tasks.
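The "problem complexity" axis is the number of tile permutations the network must distinguish (the 100 / 2K / 10K permutation sets in the models above). Following the original Jigsaw paper, the permutation set is chosen to be spread out in Hamming distance, so that no two target classes are near-identical shufflings. The following is an illustrative sketch with toy sizes and hypothetical helper names, not the reference implementation (which enumerates 9-tile permutations):

```python
import itertools
import numpy as np

def hamming(a, b):
    # Number of positions where two permutations disagree.
    return int(np.sum(np.asarray(a) != np.asarray(b)))

def build_permutation_set(n_tiles=4, n_perms=8, seed=0):
    """Greedily pick permutations that are far apart in Hamming distance.

    A larger, more spread-out set makes the pretext task 'harder',
    which is the knob the 100 / 2K / 10K models turn.
    """
    rng = np.random.default_rng(seed)
    all_perms = np.array(list(itertools.permutations(range(n_tiles))))
    chosen = [all_perms[rng.integers(len(all_perms))]]
    while len(chosen) < n_perms:
        # Pick the candidate whose minimum distance to the chosen set is largest.
        dists = np.array([min(hamming(p, c) for c in chosen) for p in all_perms])
        chosen.append(all_perms[int(np.argmax(dists))])
    return np.array(chosen)

perms = build_permutation_set()
```

Already-chosen permutations have minimum distance 0 to the set, so the greedy argmax never re-selects them while unchosen candidates remain.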

How do I train this model?

Get started with VISSL by trying one of the Colab tutorial notebooks.
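For orientation, the pretext task itself can be sketched as: cut an image into a 3×3 grid of tiles, shuffle the tiles with one permutation from a fixed set, and train the network to predict which permutation was applied. A minimal NumPy sketch with hypothetical helper names (VISSL's actual data pipeline additionally applies per-tile cropping and color augmentation):

```python
import numpy as np

def make_jigsaw_sample(image, permutations, rng):
    """Cut `image` into a 3x3 grid, shuffle the tiles with a randomly
    chosen permutation, and return (tiles, label).

    The network is trained to predict `label`, the index of the
    permutation that was applied.
    """
    h, w = image.shape[:2]
    th, tw = h // 3, w // 3
    tiles = [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
             for r in range(3) for c in range(3)]
    label = int(rng.integers(len(permutations)))
    shuffled = [tiles[i] for i in permutations[label]]
    return np.stack(shuffled), label

# Toy usage: a 2-permutation set on a random 96x96 "image".
rng = np.random.default_rng(0)
perms = [tuple(range(9)), (8, 7, 6, 5, 4, 3, 2, 1, 0)]
img = rng.random((96, 96, 3))
tiles, label = make_jigsaw_sample(img, perms, rng)
```

Applying the inverse permutation (`np.argsort(perms[label])`) to the returned tiles recovers the original tile order, which is a quick sanity check on a data-loading implementation.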

Citation

@article{DBLP:journals/corr/abs-1905-01235,
  author    = {Priya Goyal and
               Dhruv Mahajan and
               Abhinav Gupta and
               Ishan Misra},
  title     = {Scaling and Benchmarking Self-Supervised Visual Representation Learning},
  journal   = {CoRR},
  volume    = {abs/1905.01235},
  year      = {2019},
  url       = {http://arxiv.org/abs/1905.01235},
  archivePrefix = {arXiv},
  eprint    = {1905.01235},
  timestamp = {Mon, 28 Sep 2020 08:19:37 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1905-01235.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
@article{DBLP:journals/corr/NorooziF16,
  author    = {Mehdi Noroozi and
               Paolo Favaro},
  title     = {Unsupervised Learning of Visual Representations by Solving Jigsaw
               Puzzles},
  journal   = {CoRR},
  volume    = {abs/1603.09246},
  year      = {2016},
  url       = {http://arxiv.org/abs/1603.09246},
  archivePrefix = {arXiv},
  eprint    = {1603.09246},
  timestamp = {Mon, 13 Aug 2018 16:49:09 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/NorooziF16.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
@misc{goyal2021vissl,
  author =       {Priya Goyal and Benjamin Lefaudeux and Mannat Singh and Jeremy Reizenstein and Vinicius Reis and 
                  Min Xu and Matthew Leavitt and Mathilde Caron and Piotr Bojanowski and Armand Joulin and 
                  Ishan Misra},
  title =        {VISSL},
  howpublished = {\url{https://github.com/facebookresearch/vissl}},
  year =         {2021}
}

Results

Image Classification on ImageNet

| Model | Top 1 Accuracy |
| --- | --- |
| Jigsaw ResNet-50 (Goyal19, ImageNet-22K) | 53.09% |
| Jigsaw ResNet-50 (Goyal19, YFCC100M) | 51.37% |
| Jigsaw ResNet-50 - 100 permutations | 48.57% |
| Jigsaw ResNet-50 - 10K permutations | 48.11% |
| Jigsaw ResNet-50 (ImageNet-1K, 2K permutations) | 46.73% |
| Jigsaw ResNet-50 (Goyal19, ImageNet-1K) | 46.58% |
| Jigsaw ResNet-50 (ImageNet-22K, 2K permutations) | 44.84% |
| Jigsaw AlexNet (Goyal19, ImageNet-22K) | 37.5% |
| Jigsaw AlexNet (Goyal19, YFCC100M) | 37.01% |
| Jigsaw AlexNet (Goyal19, ImageNet-1K) | 34.82% |