Class-Balanced Loss Based on Effective Number of Samples

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
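The re-weighting scheme above can be sketched in a few lines: compute the effective number $(1-\beta^{n})/(1-\beta)$ per class, then weight each class by its inverse. This is a minimal illustration of the formula, not the authors' released implementation; the normalization that makes the weights sum to the number of classes is a common convention, and the function names are chosen here for clarity.

```python
import numpy as np

def effective_number(n, beta):
    """Effective number of samples E_n = (1 - beta^n) / (1 - beta),
    with n the sample count and beta in [0, 1)."""
    return (1.0 - np.power(beta, n)) / (1.0 - beta)

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class loss weights proportional to 1 / E_n, normalized so
    they sum to the number of classes (a common convention)."""
    n = np.asarray(samples_per_class, dtype=np.float64)
    weights = 1.0 / effective_number(n, beta)
    return weights / weights.sum() * len(n)

# Example: a long-tailed 3-class dataset; the rarest class (10 samples)
# receives the largest weight.
w = class_balanced_weights([1000, 100, 10], beta=0.99)
```

Note that $\beta$ controls how quickly the benefit of extra samples saturates: $\beta = 0$ recovers no re-weighting (every class has effective number 1), while $\beta \to 1$ approaches inverse-frequency weighting.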

CVPR 2019

Results from the Paper


Task | Dataset | Model | Metric | Value | Global Rank
--- | --- | --- | --- | --- | ---
Long-tail Learning | CIFAR-100-LT (ρ=100) | Cross-Entropy (CE) | Error Rate | 61.68 | #63
Long-tail Learning | CIFAR-10-LT (ρ=10) | Class-balanced Focal Loss | Error Rate | 12.90 | #45
Long-tail Learning | CIFAR-10-LT (ρ=10) | Class-balanced Reweighting | Error Rate | 13.46 | #48
Long-tail Learning | COCO-MLT | CB Loss (ResNet-50) | Average mAP | 49.06 | #9
Long-tail Learning | EGTEA | CB Loss | Average Precision | 63.39 | #2
Long-tail Learning | EGTEA | CB Loss | Average Recall | 63.26 | #2
Image Classification | iNaturalist 2018 | ResNet-152 | Top-1 Accuracy | 69.05% | #36
Image Classification | iNaturalist 2018 | ResNet-101 | Top-1 Accuracy | 67.98% | #39
Image Classification | iNaturalist 2018 | ResNet-50 | Top-1 Accuracy | 64.16% | #47
Long-tail Learning | VOC-MLT | CB Focal (ResNet-50) | Average mAP | 75.24 | #9
