Robust Curriculum Learning: from clean label detection to noisy label self-correction

ICLR 2021 · Tianyi Zhou, Shengjie Wang, Jeff Bilmes

Neural network training can easily overfit to noisy labels, resulting in poor generalization performance. Existing methods address this problem by (1) filtering out the noisy data and training only on the clean data, or (2) relabeling the noisy data using either the model being trained or another model trained on a clean dataset. However, the former strategy discards the useful information in the wrongly-labeled data, while the latter may introduce extra noise if the relabeling quality is poor. In this paper, we propose a smooth transition and interplay between these two strategies as a curriculum that selects training samples by a dynamic criterion. In particular, we start with learning from clean data and then gradually move to learning from noisy-labeled data with pseudo labels produced by a time-ensemble of the model and data augmentations. Unlike the instantaneous loss widely used for noise detection, our data selection is based on the dynamics of both the loss and the output consistency of each sample across training steps and different data augmentations, resulting in more precise detection of both clean labels and correct pseudo labels. On multiple noisy-label benchmarks, we show that our curriculum learning strategy significantly improves test accuracy without any auxiliary model or extra clean data.
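To make the abstract's selection criterion concrete, the sketch below illustrates one plausible realization in PyTorch: pseudo labels come from an exponential moving average (a "time-ensemble") of softmax outputs over training steps and augmentations, and a sample is used for training when its loss history is consistently low (likely clean) or its ensembled prediction is consistent enough (reliable pseudo label). All function names, thresholds, and update rules here are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def update_time_ensemble(ema_probs, probs, momentum=0.9):
    """EMA of per-sample class probabilities across steps/augmentations."""
    return momentum * ema_probs + (1.0 - momentum) * probs

def select_and_relabel(model, x_augs, y, ema_probs, loss_hist,
                       clean_thresh=0.2, consist_thresh=0.9):
    """Return per-sample targets and a mask of samples to train on.

    x_augs:    list of differently augmented views of the same batch.
    y:         given (possibly noisy) labels, shape (B,).
    ema_probs: running time-ensemble of softmax outputs, shape (B, C).
    loss_hist: running average of per-sample losses, shape (B,).
    """
    with torch.no_grad():
        # Average predictions over augmentations, then fold into the EMA.
        probs = torch.stack([F.softmax(model(v), dim=1) for v in x_augs]).mean(0)
        ema_probs = update_time_ensemble(ema_probs, probs)

        # Dynamics-based criteria instead of a single instantaneous loss:
        loss = F.nll_loss(ema_probs.clamp_min(1e-8).log(), y, reduction="none")
        loss_hist = 0.9 * loss_hist + 0.1 * loss
        consistency, pseudo = ema_probs.max(dim=1)

        clean = loss_hist < clean_thresh            # likely correctly labeled
        confident = consistency > consist_thresh    # reliable pseudo label

        # Curriculum: keep given labels for "clean" samples, switch the rest
        # to pseudo labels once the time-ensemble is confident enough.
        targets = torch.where(clean, y, pseudo)
        mask = clean | confident
    return targets, mask, ema_probs, loss_hist
```

Lowering `clean_thresh` or raising `consist_thresh` over training would shift the curriculum from conservative clean-label learning toward broader pseudo-label self-correction, matching the smooth transition the abstract describes.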
