SSR: An Efficient and Robust Framework for Learning with Unknown Label Noise

22 Nov 2021 · Chen Feng, Georgios Tzimiropoulos, Ioannis Patras

Despite the substantial progress in supervised learning with neural networks, there are significant challenges in obtaining high-quality, large-scale and accurately labelled datasets. In this context, learning in the presence of noisy labels has received increasing attention. Because the problem is relatively complex, current approaches often integrate components from several fields, such as supervised learning, semi-supervised learning and transfer learning, which results in complicated methods. Furthermore, they often make multiple assumptions about the type of noise in the data. This affects model robustness and limits performance under different noise conditions. In this paper, we consider a novel problem setting, Learning with Unknown Label Noise (LULN), that is, learning when both the degree and the type of noise are unknown. Under this setting, unlike previous methods that often introduce multiple assumptions and lead to complex solutions, we propose a simple, efficient and robust framework named Sample Selection and Relabelling (SSR), which achieves SOTA results in various conditions with a minimal number of hyperparameters. At the heart of our method is a sample selection and relabelling mechanism based on a non-parametric KNN classifier (NPK) $g_q$ and a parametric model classifier (PMC) $g_p$, which are used, respectively, to select the clean samples and to gradually relabel the noisy samples. Without bells and whistles such as model co-training, self-supervised pre-training and semi-supervised learning, and with robustness to the settings of its few hyperparameters, our method significantly surpasses previous methods on both CIFAR10/CIFAR100 with synthetic noise and real-world noisy datasets such as WebVision, Clothing1M and ANIMAL-10N. Code is available at https://github.com/MrChenFeng/SSR_BMVC2022.
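
The selection and relabelling mechanism described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the cosine-similarity neighbour search, the majority vote, the confidence threshold `theta`, the function names `select_clean` and `relabel`, and the toy data are all assumptions made for illustration. Here a KNN vote over feature vectors stands in for $g_q$, and a list of softmax outputs `probs` stands in for the predictions of $g_p$.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_vote(features, labels, i, k):
    """Majority label among the k nearest neighbours of sample i (i excluded)."""
    sims = sorted(
        ((cosine(features[i], features[j]), j)
         for j in range(len(features)) if j != i),
        reverse=True,
    )
    votes = Counter(labels[j] for _, j in sims[:k])
    return votes.most_common(1)[0][0]


def select_clean(features, labels, k=3):
    """Assumed selection rule: keep samples whose label matches the KNN vote."""
    return [i for i in range(len(features))
            if knn_vote(features, labels, i, k) == labels[i]]


def relabel(labels, probs, theta=0.9):
    """Assumed relabelling rule: adopt the model's prediction only when its
    confidence exceeds theta; otherwise keep the given label."""
    return [max(range(len(p)), key=p.__getitem__) if max(p) > theta else y
            for p, y in zip(probs, labels)]


# Toy data (hypothetical): two well-separated clusters; sample 2 is mislabelled.
features = [(1.0, 0.0), (0.9, 0.1), (0.95, 0.05), (0.85, 0.15),
            (0.0, 1.0), (0.1, 0.9), (0.05, 0.95), (0.15, 0.85)]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
probs = [[0.95, 0.05], [0.80, 0.20], [0.97, 0.03], [0.85, 0.15],
         [0.12, 0.88], [0.20, 0.80], [0.05, 0.95], [0.30, 0.70]]

clean_before = select_clean(features, labels, k=3)   # sample 2 is rejected
new_labels = relabel(labels, probs, theta=0.9)       # sample 2 becomes class 0
clean_after = select_clean(features, new_labels, k=3)
```

On this toy data the mislabelled sample is excluded from the clean subset until the confident model prediction corrects its label, after which every sample passes the neighbour-agreement check. In a training loop the two steps would presumably be repeated as the features and predictions improve, which is one reading of "gradually relabel" in the abstract.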

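| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Learning with noisy labels | ANIMAL | SSR | Accuracy | 88.5 | #3 |
| Learning with noisy labels | ANIMAL | SSR | Network | Vgg19-BN | #1 |
| Learning with noisy labels | ANIMAL | SSR | ImageNet Pretrained | NO | #1 |
| Image Classification | CIFAR-10 (with noisy labels) | SSR | Accuracy (under 20% Sym. label noise) | 96.74% | #1 |
| Image Classification | CIFAR-10 (with noisy labels) | SSR | Accuracy (under 50% Sym. label noise) | 96.13% | #2 |
| Image Classification | CIFAR-10 (with noisy labels) | SSR | Accuracy (under 80% Sym. label noise) | 95.56% | #1 |
| Image Classification | CIFAR-10 (with noisy labels) | SSR | Accuracy (under 90% Sym. label noise) | 95.17% | #1 |
| Image Classification | Clothing1M | SSR | Accuracy | 74.91 | #11 |
| Image Classification | mini WebVision 1.0 | SSR | Top-1 Accuracy | 80.92 | #4 |
| Image Classification | mini WebVision 1.0 | SSR | Top-5 Accuracy | 92.80 | #9 |
| Image Classification | mini WebVision 1.0 | SSR | ImageNet Top-1 Accuracy | 75.76 | #11 |
| Image Classification | mini WebVision 1.0 | SSR | ImageNet Top-5 Accuracy | 91.76 | #18 |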
