Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification

Unsupervised image classification is a challenging computer vision task. Deep learning-based algorithms have achieved superb results, and the latest approach adopts a unified loss that combines the embedding and class assignment processes. Since these processes inherently have different goals, jointly optimizing them may lead to a suboptimal solution. To address this limitation, we propose a novel two-stage algorithm in which an embedding module for pretraining precedes a refining module that concurrently performs embedding and class assignment. Our model outperforms the state of the art on multiple datasets in unsupervised tasks, reaching 81.0% accuracy on CIFAR-10 (an increase of 19.3 percentage points), 35.3% accuracy on CIFAR-100-20 (9.6 pp), and 66.5% accuracy on STL-10 (6.9 pp).
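The gains in the abstract are stated in percentage points, so the previous best accuracy on each dataset can be recovered by subtraction. The short sketch below does that arithmetic; the names `REPORTED`, `implied_baseline`, and `relative_gain` are illustrative and do not come from the paper, and the relative-gain figure is a derived quantity that the abstract itself does not report.

```python
# Reported accuracy (%) and gain over the prior state of the art
# (percentage points), both taken from the abstract.
REPORTED = {
    "CIFAR-10": (81.0, 19.3),
    "CIFAR-100-20": (35.3, 9.6),
    "STL-10": (66.5, 6.9),
}


def implied_baseline(accuracy, gain_pp):
    """Prior best accuracy (%) implied by a reported accuracy and its gain in pp."""
    return round(accuracy - gain_pp, 1)


def relative_gain(accuracy, gain_pp):
    """Relative improvement (%) over the implied baseline (derived, not reported)."""
    base = accuracy - gain_pp
    return round(100.0 * gain_pp / base, 1)


if __name__ == "__main__":
    for name, (acc, gain) in REPORTED.items():
        print(name, implied_baseline(acc, gain), relative_gain(acc, gain))
```

Under these numbers the implied prior accuracies are 61.7% (CIFAR-10), 25.7% (CIFAR-100-20), and 59.6% (STL-10).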

Task                               Dataset    Model  Metric     Value      Global Rank
Unsupervised Image Classification  CIFAR-10   TSUC   Accuracy   81.0       #7
Image Clustering                   CIFAR-10   TSUC   Accuracy   0.81       #21
Image Clustering                   CIFAR-10   TSUC   NMI        -          #29
Image Clustering                   CIFAR-10   TSUC   Train set  Train      #1
Image Clustering                   CIFAR-10   TSUC   ARI        -          #29
Image Clustering                   CIFAR-10   TSUC   Backbone   ResNet-18  #1
Image Clustering                   CIFAR-100  TSUC   Accuracy   0.353      #19
Unsupervised Image Classification  CIFAR-20   TSUC   Accuracy   35.3       #9
Unsupervised Image Classification  STL-10     TSUC   Accuracy   66.50      #7
Image Clustering                   STL-10     TSUC   Accuracy   0.665      #19
Image Clustering                   STL-10     TSUC   Backbone   ResNet-18  #1
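
The accuracy figures above are for models trained without labels, so predicted cluster ids have no fixed correspondence to class labels. A common convention for such benchmarks is to score the best one-to-one mapping between clusters and classes; the listing does not state the exact protocol used for TSUC, so the sketch below is an assumption about how such a number is typically computed, not the paper's evaluation code. The function name `clustering_accuracy` is hypothetical, and the brute-force search over permutations is only practical for a handful of classes (the Hungarian algorithm, for example `scipy.optimize.linear_sum_assignment`, is the usual choice at scale).

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids, num_classes):
    """Best accuracy over all one-to-one mappings from cluster ids to class labels."""
    # counts[c][t] = number of samples in cluster c whose true label is t
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, c in zip(true_labels, cluster_ids):
        counts[c][t] += 1

    best = 0
    # Try every assignment of clusters to classes and keep the best match count.
    for perm in permutations(range(num_classes)):
        matched = sum(counts[c][perm[c]] for c in range(num_classes))
        best = max(best, matched)
    return best / len(true_labels)


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    clusters = [1, 1, 0, 0, 2, 0]  # cluster ids are arbitrary names
    print(clustering_accuracy(truth, clusters, num_classes=3))
```

In the usage example, five of the six samples can be matched under the best mapping, so the score is 5/6. The same value expressed as a fraction or as a percentage explains why the table lists, for instance, both 0.81 and 81.0 for CIFAR-10.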
