Class-Aware Contrastive Semi-Supervised Learning

Pseudo-label-based semi-supervised learning (SSL) has achieved great success in utilizing raw data. However, its training procedure suffers from confirmation bias due to the noise contained in self-generated artificial labels. Moreover, the model's judgment becomes noisier in real-world applications with extensive out-of-distribution data. To address this issue, we propose a general method named Class-aware Contrastive Semi-Supervised Learning (CCSSL), a drop-in helper that improves pseudo-label quality and enhances the model's robustness in real-world settings. Rather than treating real-world data as a single union set, our method handles the two kinds of data separately: reliable in-distribution data is clustered class-wise so that it blends into downstream tasks, while noisy out-of-distribution data is handled with image-wise contrastive learning for better generalization. Furthermore, by applying target re-weighting, we emphasize learning from clean labels while reducing learning from noisy labels. Despite its simplicity, CCSSL achieves significant performance improvements over state-of-the-art SSL methods on the standard datasets CIFAR100 and STL10. On the real-world dataset Semi-iNat 2021, we improve FixMatch by 9.80% and CoMatch by 3.18%. Code is available at https://github.com/TencentYoutuResearch/Classification-SemiCLS.
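The split between class-wise clustering and image-wise contrast described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal illustration under stated assumptions. The function name `class_aware_targets`, the confidence threshold of 0.9, and the choice to re-weight a pair by the product of the two confidences are all assumptions made here for illustration. The sketch builds a pairwise target matrix: confident samples that share a pseudo-label become positives for each other, while low-confidence samples keep only themselves as a positive.

```python
from typing import List


def class_aware_targets(probs: List[List[float]], threshold: float = 0.9) -> List[List[float]]:
    """Build a pairwise target matrix for a contrastive loss (illustrative only).

    probs[i] is the predicted class distribution for unlabeled sample i.
    The threshold value and the confidence-product weighting are assumptions.
    """
    n = len(probs)
    conf = [max(p) for p in probs]            # confidence of each pseudo-label
    label = [p.index(max(p)) for p in probs]  # hard pseudo-label (argmax)
    reliable = [c >= threshold for c in conf] # treated as in-distribution

    targets = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Image-wise positive: every sample is always a positive for itself.
        targets[i][i] = 1.0
        for j in range(n):
            same_class = label[i] == label[j]
            if i != j and reliable[i] and reliable[j] and same_class:
                # Class-wise positive, re-weighted so that more confident
                # pairs contribute more than less confident ones.
                targets[i][j] = conf[i] * conf[j]
    return targets


# Four unlabeled samples, two classes. Sample 2 is low-confidence.
probs = [[0.95, 0.05], [0.92, 0.08], [0.55, 0.45], [0.03, 0.97]]
targets = class_aware_targets(probs)
```

In this example, samples 0 and 1 are both confident and share pseudo-label 0, so they attract each other with a weight equal to the product of their confidences. Sample 2 falls below the threshold and sample 3 has no confident same-class partner, so each keeps only its self-positive, which corresponds to plain image-wise contrast.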

Venue: CVPR 2022
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Semi-Supervised Image Classification | CIFAR-100, 10000 Labels | CCSSL(FixMatch) | Percentage error | 19.32 | # 1 |
| Semi-Supervised Image Classification | CIFAR-100 (10000 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 71.12 | # 2 |
| Semi-Supervised Image Classification | CIFAR-100, 2500 Labels | CCSSL(FixMatch) | Percentage error | 24.3 | # 1 |
| Semi-Supervised Image Classification | CIFAR-100 (250 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 56.3 | # 1 |
| Semi-Supervised Image Classification | CIFAR-100, 400 Labels | CCSSL(FixMatch) | Percentage error | 38.81 | # 8 |
| Semi-Supervised Image Classification | CIFAR-100 (400 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 24.53 | # 2 |
| Semi-Supervised Image Classification | CIFAR-10 (250 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 67.2 | # 2 |
| Semi-Supervised Image Classification | CIFAR-10 (4000 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 88.77 | # 2 |
| Image Classification | CIFAR-10 (40 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 30.89 | # 2 |
| Semi-Supervised Image Classification | STL-10 (1000 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 82.0 | # 2 |
| Semi-Supervised Image Classification | SVHN (1000 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 88.6 | # 2 |
| Semi-Supervised Image Classification | SVHN (250 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 80.39 | # 2 |
| Semi-Supervised Image Classification | SVHN (40 Labels, ImageNet-100 Unlabeled) | CCSSL | Accuracy | 50.02 | # 2 |

Methods