Class-Balanced Distillation for Long-Tailed Visual Recognition

12 Apr 2021  ·  Ahmet Iscen, André Araujo, Boqing Gong, Cordelia Schmid

Real-world imagery is often characterized by a significant imbalance in the number of images per class, leading to long-tailed distributions. An effective and simple approach to long-tailed visual recognition is to learn the feature representation and the classifier separately, with instance sampling and class-balanced sampling, respectively. In this work, we introduce a new framework, built on the key observation that a feature representation learned with instance sampling is far from optimal in a long-tailed setting. Our main contribution is a new training method, referred to as Class-Balanced Distillation (CBD), that leverages knowledge distillation to enhance feature representations. CBD allows the feature representation to evolve in the second training stage, guided by the teacher learned in the first stage. The second stage uses class-balanced sampling in order to focus on under-represented classes. This framework can naturally accommodate multiple teachers, unlocking the information from an ensemble of models to enhance recognition capabilities. Our experiments show that the proposed technique consistently outperforms the state of the art on long-tailed recognition benchmarks such as ImageNet-LT, iNaturalist17 and iNaturalist18.
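The sketch below illustrates the second (class-balanced) training stage described above, assuming feature-level distillation from one or more frozen first-stage teachers. The helper names (`class_balanced_sampler`, `cbd_step`), the cosine-similarity form of the distillation term, and the loss weighting are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch (assumed PyTorch implementation) of class-balanced distillation:
# a student backbone is trained on class-balanced batches while its features are
# pulled toward the features of frozen teachers trained with instance sampling.
import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler

def class_balanced_sampler(labels, num_classes):
    # Weight each sample by the inverse frequency of its class, so that every class
    # is drawn roughly equally often despite the long-tailed label distribution.
    counts = torch.bincount(labels, minlength=num_classes).float()
    weights = 1.0 / counts[labels]
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

def cbd_step(student, classifier, teachers, images, targets, distill_weight=1.0):
    # Student features and classifier are updated on class-balanced batches.
    feats = student(images)
    logits = classifier(feats)
    cls_loss = F.cross_entropy(logits, targets)

    # Teachers from the first stage stay frozen; with several teachers this
    # distills an ensemble into a single student backbone.
    with torch.no_grad():
        teacher_feats = [t(images) for t in teachers]

    distill_loss = 0.0
    for tf in teacher_feats:
        # Encourage the student's features to align with each teacher's features
        # (assumed cosine-similarity distillation term).
        distill_loss = distill_loss + (1.0 - F.cosine_similarity(feats, tf, dim=1).mean())
    distill_loss = distill_loss / len(teacher_feats)

    return cls_loss + distill_weight * distill_loss
```

In this reading, the classification loss keeps the student discriminative on the rebalanced data, while the distillation term prevents the class-balanced second stage from degrading the representation learned with instance sampling.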

Task                  Dataset           Model                   Metric Name      Metric Value   Global Rank
Long-tail Learning    ImageNet-LT       CBD-ENS (ResNet-50)     Top-1 Accuracy   55.6%          #32
Long-tail Learning    ImageNet-LT       CBD-ENS (ResNet-152)    Top-1 Accuracy   57.7%          #22
Long-tail Learning    iNaturalist 2018  CBD-ENS (ResNet-101)    Top-1 Accuracy   75.3%          #11
Long-tail Learning    iNaturalist 2018  CBD-ENS (ResNet-50)     Top-1 Accuracy   73.6%          #18
Image Classification  iNaturalist 2018  CBD-ENS (ResNet-50)     Top-1 Accuracy   73.6%          #26
Image Classification  iNaturalist 2018  CBD-ENS (ResNet-101)    Top-1 Accuracy   75.3%          #21

Methods