FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling

The recently proposed FixMatch achieved state-of-the-art results on most semi-supervised learning (SSL) benchmarks. However, like other modern SSL algorithms, FixMatch uses a pre-defined constant threshold for all classes to select the unlabeled data that contribute to training, and thus fails to account for the differing learning status and learning difficulty of each class. To address this issue, we propose Curriculum Pseudo Labeling (CPL), a curriculum learning approach that leverages unlabeled data according to the model's learning status. The core of CPL is to flexibly adjust the threshold for each class at each time step, so that informative unlabeled data and their pseudo labels are allowed to pass. CPL does not introduce additional parameters or computations (forward or backward propagation). We apply CPL to FixMatch and call our improved algorithm FlexMatch. FlexMatch achieves state-of-the-art performance on a variety of SSL benchmarks, with especially strong performance when the labeled data are extremely limited or when the task is challenging. For example, FlexMatch achieves 13.96% and 18.96% error rate reduction over FixMatch on the CIFAR-100 and STL-10 datasets respectively, when there are only 4 labels per class. CPL also significantly boosts convergence speed, e.g., FlexMatch needs only 1/5 of the training time of FixMatch to achieve even better performance. Furthermore, we show that CPL can be easily adapted to other SSL algorithms and remarkably improves their performance. We open-source our code at https://github.com/TorchSSL/TorchSSL.
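To make the per-class thresholding idea concrete, the following is a minimal sketch in plain Python, not the released implementation. It assumes details that the abstract does not state: a class's learning status is estimated as the number of unlabeled samples confidently predicted as that class, the normalized status is passed through the mapping x / (2 - x), and a warm-up term (the count of not-yet-confident samples) enters the normalizer. The names `flexible_thresholds`, `select_mask`, and `base_threshold` are hypothetical, and the inputs are assumed to be the model's current predictions over the unlabeled set.

```python
from collections import Counter


def flexible_thresholds(confidences, predictions, num_classes, base_threshold=0.95):
    """Return one confidence threshold per class (illustrative sketch).

    confidences[i] is the model's highest class probability for unlabeled
    sample i, and predictions[i] is the class that attains it.
    """
    # Assumed learning-status estimate: number of samples confidently
    # predicted as each class under the fixed base threshold.
    counts = Counter(
        pred for conf, pred in zip(confidences, predictions) if conf > base_threshold
    )
    status = [counts.get(c, 0) for c in range(num_classes)]

    # Assumed warm-up: samples that are not yet confident also enter the
    # normalizer, so thresholds rise gradually early in training.
    unused = len(confidences) - sum(status)
    denom = max(max(status), unused)
    if denom == 0:  # empty input: fall back to the fixed threshold
        return [base_threshold] * num_classes

    thresholds = []
    for s in status:
        beta = s / denom              # normalized learning status in [0, 1]
        mapped = beta / (2.0 - beta)  # assumed convex mapping, still in [0, 1]
        thresholds.append(mapped * base_threshold)
    return thresholds


def select_mask(confidences, predictions, thresholds):
    """True where a sample's pseudo label passes its own class threshold."""
    return [conf >= thresholds[pred] for conf, pred in zip(confidences, predictions)]


if __name__ == "__main__":
    conf = [0.99, 0.97, 0.90, 0.98, 0.15, 0.50]
    pred = [0, 0, 0, 1, 1, 2]
    th = flexible_thresholds(conf, pred, num_classes=3)
    print(th)
    print(select_mask(conf, pred, th))
```

In this sketch a well-learned class keeps a threshold close to the fixed one, while a class with few confident predictions gets a lower threshold, so more of its pseudo labels take part in training. Only counting and elementwise arithmetic are involved, which is consistent with the claim that no extra forward or backward passes are needed.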

NeurIPS 2021
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Semi-Supervised Image Classification | CIFAR-100, 10000 Labels | FlexMatch | Percentage error | 21.90±0.15 | # 10 |
| Semi-Supervised Image Classification | CIFAR-100, 2500 Labels | FlexMatch | Percentage error | 26.49±0.20 | # 6 |
| Semi-Supervised Image Classification | CIFAR-100, 400 Labels | FlexMatch | Percentage error | 39.94±1.62 | # 6 |
| Semi-Supervised Image Classification | CIFAR-10, 250 Labels | FlexMatch | Percentage error | 4.8±0.06 | # 5 |
| Semi-Supervised Image Classification | CIFAR-10, 4000 Labels | FlexMatch | Percentage error | 4.19±0.01 | # 11 |
| Semi-Supervised Image Classification | CIFAR-10, 40 Labels | FlexMatch | Percentage error | 4.99±0.16 | # 3 |
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | FlexMatch | Top 5 Accuracy | 86.04% | # 27 |
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | FlexMatch | Top 1 Accuracy | 64.79% | # 31 |