High-Likelihood Area Matters --- Rewarding Near-Correct Predictions Under Imbalanced Distributions
Learning from natural datasets poses significant challenges for traditional classification methods based on the cross-entropy objective because class distributions are imbalanced, in particular long-tailed. It is intuitive to assume that examples from tail classes are harder to learn, so the system should be uncertain about them: their predictions fall in the low-likelihood area, and the system is driven more actively to correct them. However, this assumption is one-sided and can be misleading. We find in practice that the high-likelihood area also contains correct predictions for tail classes, and that it plays a vital role in learning imbalanced class distributions. In light of this finding, we propose the encourage loss, which rewards the system when examples from tail classes in the high-likelihood area are predicted correctly. In contrast to traditional methods that focus only on whether a prediction is right or wrong, the encourage loss puts weight on strengthening correct predictions. Experiments on the large-scale long-tailed iNaturalist 2018 classification dataset and the ImageNet-LT benchmark both validate the proposed approach. We further analyze in detail the influence of the encourage loss on diverse data distributions, covering both computer vision and natural language processing tasks.
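One way to "reward" confident correct predictions, rather than merely stop penalizing them, is to add a bonus term to cross-entropy that keeps decreasing the loss as the target-class probability approaches 1. The sketch below is an illustrative instantiation of that idea, not the authors' exact formulation: the bonus form `log(1 - p)`, the `bonus_weight` knob, and the clamping constant are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F


def encourage_loss(logits, targets, bonus_weight=1.0, eps=1e-6):
    """Cross-entropy plus an additive bonus that rewards confident
    correct predictions (an illustrative sketch, not the paper's
    exact loss).

    logits:  (batch, num_classes) unnormalized scores
    targets: (batch,) integer class labels
    """
    log_p = F.log_softmax(logits, dim=-1)
    # Probability assigned to the correct class for each example.
    p = log_p.exp().gather(1, targets.unsqueeze(1)).squeeze(1)
    # Standard cross-entropy term: penalizes wrong/uncertain predictions.
    ce = F.nll_loss(log_p, targets, reduction="none")
    # Bonus term: log(1 - p) grows more negative as p -> 1, so confident
    # correct predictions actively lower the loss instead of plateauing
    # near zero. Clamping avoids log(0) when p is numerically 1.
    bonus = torch.log((1.0 - p).clamp_min(eps))
    return (ce + bonus_weight * bonus).mean()
```

Because the bonus is always non-positive, this loss is strictly below plain cross-entropy on the same batch; `bonus_weight` controls how strongly near-correct predictions are reinforced relative to the usual penalty on mistakes.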