Self-Supervised Image Classification
85 papers with code • 2 benchmarks • 1 datasets
This is the task of image classification using representations learnt with self-supervised learning. Self-supervised methods generally involve a pretext task that is solved to learn a good representation and a loss function to learn with. One example of a loss function is an autoencoder based loss where the goal is reconstruction of an image pixel-by-pixel. A more popular recent example is a contrastive loss, which measure the similarity of sample pairs in a representation space, and where there can be a varying target instead of a fixed target to reconstruct (as in the case of autoencoders).
A common evaluation protocol is to train a linear classifier on top of (frozen) representations learnt by self-supervised methods. The leaderboards for the linear evaluation protocol can be found below. In practice, it is more common to fine-tune features on a downstream task. An alternative evaluation protocol therefore uses semi-supervised learning and finetunes on a % of the labels. The leaderboards for the finetuning protocol can be accessed here.
You may want to read some blog posts before reading the papers and checking the leaderboards:
- Contrastive Self-Supervised Learning - Ankesh Anand
- The Illustrated Self-Supervised Learning - Amit Chaudhary
- Self-supervised learning and computer vision - Jeremy Howard
- Self-Supervised Representation Learning - Lilian Weng
There is also Yann LeCun's talk at AAAI-20 which you can watch here (35:00+).
( Image credit: A Simple Framework for Contrastive Learning of Visual Representations )
Libraries
Use these libraries to find Self-Supervised Image Classification models and implementationsLatest papers
Improving Visual Representation Learning through Perceptual Understanding
We present an extension to masked autoencoders (MAE) which improves on the representations learnt by the model by explicitly encouraging the learning of higher scene-level features.
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
We launch EVA, a vision-centric foundation model to explore the limits of visual representation at scale using only publicly accessible data.
Towards Sustainable Self-supervised Learning
In this work, we explore a sustainable SSL framework with two major challenges: i) learning a stronger new SSL model based on the existing pretrained SSL model, also called as "base" model, in a cost-friendly manner, ii) allowing the training of the new model to be compatible with various base models.
Exploring Target Representations for Masked Autoencoders
Masked autoencoders have become popular training paradigms for self-supervised visual representation learning.
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
The large-size BEiT v2 obtains 87. 3% top-1 accuracy for ImageNet-1K (224 size) fine-tuning, and 56. 7% mIoU on ADE20K for semantic segmentation.
Model-Aware Contrastive Learning: Towards Escaping the Dilemmas
Contrastive learning (CL) continuously achieves significant breakthroughs across multiple domains.
Bootstrapped Masked Autoencoders for Vision BERT Pretraining
The first design is motivated by the observation that using a pretrained MAE to extract the features as the BERT prediction target for masked tokens can achieve better pretraining performance.
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping
In this paper, we propose a genuine group-level contrastive visual representation learning method whose linear evaluation performance on ImageNet surpasses the vanilla supervised learning.
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
We then propose an Architecture-Agnostic Masked Image Modeling framework (A$^2$MIM), which is compatible with both Transformers and CNNs in a unified way.
Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder
Reliable large-scale cell detection and segmentation is the fundamental first step to understanding biological processes in the brain.