Class Prototype-based Cleaner for Label Noise Learning

21 Dec 2022  ·  Jingjia Huang, Yuanqi Chen, Jiashi Feng, Xinglong Wu ·

Semi-supervised learning based methods are current SOTA solutions to the noisy-label learning problem, which rely on learning an unsupervised label cleaner first to divide the training samples into a labeled set for clean data and an unlabeled set for noise data. Typically, the cleaner is obtained via fitting a mixture model to the distribution of per-sample training losses. However, the modeling procedure is \emph{class agnostic} and assumes the loss distributions of clean and noise samples are the same across different classes. Unfortunately, in practice, such an assumption does not always hold due to the varying learning difficulty of different classes, thus leading to sub-optimal label noise partition criteria. In this work, we reveal this long-ignored problem and propose a simple yet effective solution, named \textbf{C}lass \textbf{P}rototype-based label noise \textbf{C}leaner (\textbf{CPC}). Unlike previous works treating all the classes equally, CPC fully considers loss distribution heterogeneity and applies class-aware modulation to partition the clean and noise data. CPC takes advantage of loss distribution modeling and intra-class consistency regularization in feature space simultaneously and thus can better distinguish clean and noise labels. We theoretically justify the effectiveness of our method by explaining it from the Expectation-Maximization (EM) framework. Extensive experiments are conducted on the noisy-label benchmarks CIFAR-10, CIFAR-100, Clothing1M and WebVision. The results show that CPC consistently brings about performance improvement across all benchmarks. Codes and pre-trained models will be released at \url{https://github.com/hjjpku/CPC.git}.

PDF Abstract

Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Image Classification Clothing1M CPC Accuracy 75.40±0.10% # 3
Image Classification mini WebVision 1.0 CPC Top-1 Accuracy 79.63±0.08 # 12
Top-5 Accuracy 93.46±0.10 # 6
ImageNet Top-1 Accuracy 75.75±0.14 # 12
ImageNet Top-5 Accuracy 93.49±0.25 # 4

Methods


No methods listed for this paper. Add relevant methods here