Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

NeurIPS 2021 · Erik Englesson, Hossein Azizpour

Prior works have found it beneficial to combine provably noise-robust loss functions, e.g., mean absolute error (MAE), with standard categorical loss functions, e.g., cross entropy (CE), to improve their learnability. Here, we propose to use the Jensen-Shannon divergence as a noise-robust loss function and show that it interestingly interpolates between CE and MAE with a controllable mixing parameter. Furthermore, we make a crucial observation that CE exhibits lower consistency around noisy data points. Based on this observation, we adopt a generalized version of the Jensen-Shannon divergence for multiple distributions to encourage consistency around data points. Using this loss function, we show state-of-the-art results on both synthetic (CIFAR) and real-world (e.g., WebVision) noise with varying noise rates.
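
To make the generalized divergence concrete, below is a minimal PyTorch sketch of the generalized Jensen-Shannon divergence over M distributions: the entropy of the weighted mixture minus the weighted mixture of entropies. This is an illustration only; the function names (entropy, generalized_jsd), the mixing weights, and the two-distribution usage example are our assumptions, and the paper's actual loss additionally normalizes the divergence and applies it across multiple augmented views of each input, which is not reproduced here.

```python
import torch
import torch.nn.functional as F


def entropy(p, eps=1e-12):
    # Shannon entropy H(p) = -sum_c p_c log p_c, computed per row.
    return -(p * (p + eps).log()).sum(dim=-1)


def generalized_jsd(dists, weights):
    """Generalized Jensen-Shannon divergence among M distributions.

    GJS_pi(p_1, ..., p_M) = H(sum_i pi_i p_i) - sum_i pi_i H(p_i)

    dists:   list of M tensors, each (batch, num_classes), rows summing to 1
    weights: tensor of shape (M,), non-negative, summing to 1
    """
    stacked = torch.stack(dists)                             # (M, batch, C)
    mixture = (weights.view(-1, 1, 1) * stacked).sum(dim=0)  # (batch, C)
    return entropy(mixture) - (weights.view(-1, 1) * entropy(stacked)).sum(dim=0)


# Example: a two-distribution JS loss between the one-hot label distribution
# and the model's softmax prediction; pi controls the CE/MAE interpolation
# discussed in the abstract (hypothetical values for illustration).
if __name__ == "__main__":
    batch, num_classes, pi = 4, 10, 0.5
    logits = torch.randn(batch, num_classes)
    labels = torch.randint(num_classes, (batch,))
    p_label = F.one_hot(labels, num_classes).float()
    p_model = logits.softmax(dim=-1)
    weights = torch.tensor([1.0 - pi, pi])
    loss = generalized_jsd([p_label, p_model], weights).mean()
    print(loss)
```

With more than two distributions (e.g., the label plus predictions for several augmentations of the same image), the same function measures how consistent the set of distributions is, which is the consistency-encouraging use described above.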


Results


Task                 | Dataset            | Model           | Metric                  | Metric Value | Global Rank
---------------------|--------------------|-----------------|-------------------------|--------------|------------
Image Classification | mini WebVision 1.0 | GJS (ResNet-50) | Top-1 Accuracy          | 79.28        | #17
Image Classification | mini WebVision 1.0 | GJS (ResNet-50) | Top-5 Accuracy          | 91.22        | #22
Image Classification | mini WebVision 1.0 | GJS (ResNet-50) | ImageNet Top-1 Accuracy | 75.50        | #14
Image Classification | mini WebVision 1.0 | GJS (ResNet-50) | ImageNet Top-5 Accuracy | 91.27        | #21
