Fixing the train-test resolution discrepancy

Data-augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the typical size of the objects seen by the classifier at train and test time. We experimentally validate that, for a target test resolution, using a lower train resolution offers better classification at test time. We then propose a simple yet effective and efficient strategy to optimize the classifier performance when the train and test resolutions differ. It involves only a computationally cheap fine-tuning of the network at the test resolution. This enables training strong classifiers using small training images. For instance, we obtain 77.1% top-1 accuracy on ImageNet with a ResNet-50 trained on 128x128 images, and 79.8% with one trained on 224x224 images. In addition, if we use extra training data, we get 82.5% with the ResNet-50 trained with 224x224 images. Conversely, when training a ResNeXt-101 32x48d pre-trained in a weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%) (single-crop). To the best of our knowledge, this is the highest ImageNet single-crop, top-1 and top-5 accuracy to date.

NeurIPS 2019

Results from the Paper


Ranked #2 on Fine-Grained Image Classification on Birdsnap (using extra training data)

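| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Fine-Grained Image Classification | Birdsnap | FixSENet-154 | Accuracy | 84.3% | #2 |
| Fine-Grained Image Classification | CUB-200-2011 | FixSENet-154 | Accuracy | 88.7% | #12 |
| Image Classification | ImageNet | FixResNeXt-101 32x48d | Top 1 Accuracy | 86.4% | #86 |
| Image Classification | ImageNet | FixResNeXt-101 32x48d | Top 5 Accuracy | 98.0% | #11 |
| Image Classification | ImageNet | FixResNeXt-101 32x48d | Number of params | 829M | #696 |
| Image Classification | ImageNet | FixResNeXt-101 32x48d | Hardware Burden | 62G | #1 |
| Image Classification | ImageNet | FixPNASNet-5 | Top 1 Accuracy | 83.7% | #224 |
| Image Classification | ImageNet | FixPNASNet-5 | Top 5 Accuracy | 96.8% | #53 |
| Image Classification | ImageNet | FixPNASNet-5 | Number of params | 86.1M | #590 |
| Image Classification | ImageNet | FixResNet-50 Billion-scale@224 | Top 1 Accuracy | 82.5% | #314 |
| Image Classification | ImageNet | FixResNet-50 Billion-scale@224 | Top 5 Accuracy | 96.6% | #62 |
| Image Classification | ImageNet | FixResNet-50 Billion-scale@224 | Number of params | 25.6M | #419 |
| Image Classification | ImageNet | FixResNet-50 CutMix | Top 1 Accuracy | 79.8% | #460 |
| Image Classification | ImageNet | FixResNet-50 CutMix | Top 5 Accuracy | 94.9% | #134 |
| Image Classification | ImageNet | FixResNet-50 | Top 1 Accuracy | 79.1% | #490 |
| Image Classification | ImageNet | FixResNet-50 | Top 5 Accuracy | 94.6% | #150 |
| Image Classification | ImageNet ReaL | FixResNeXt-101 32x48d | Accuracy | 89.73% | #22 |
| Image Classification | ImageNet ReaL | FixResNeXt-101 32x48d | Params | 829M | #50 |
| Image Classification | iNaturalist | FixSENet-154 | Top 1 Accuracy | 75.4% | #4 |
| Fine-Grained Image Classification | NABirds | FixSENet-154 | Accuracy | 89.2% | #5 |
| Fine-Grained Image Classification | Oxford 102 Flowers | FixInceptionResNet-V2 | Top-1 Error Rate | 4.3% | #3 |
| Fine-Grained Image Classification | Oxford 102 Flowers | FixInceptionResNet-V2 | Accuracy | 95.7% | #17 |
| Fine-Grained Image Classification | Oxford-IIIT Pets | FixSENet-154 | Top-1 Error Rate | 5.2% | #3 |
| Fine-Grained Image Classification | Oxford-IIIT Pets | FixSENet-154 | Accuracy | 94.8% | #8 |
| Fine-Grained Image Classification | Stanford Cars | FixSENet-154 | Accuracy | 94.4% | #31 |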

Methods