This is the task of image classification using representations learnt with self-supervised learning. Self-supervised methods generally involve a pretext task that is solved to learn a good representation, together with a loss function to learn with. One example of a loss function is an autoencoder-based loss, where the goal is to reconstruct an image pixel by pixel. A more recent and popular example is a contrastive loss, which measures the similarity of sample pairs in a representation space and can use a varying, batch-dependent target rather than a fixed reconstruction target (as in the case of autoencoders).
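To make the contrast with pixel-wise reconstruction concrete, below is a minimal sketch of an NT-Xent/InfoNCE-style contrastive loss in PyTorch; the function name, batch layout, and temperature value are illustrative assumptions rather than the exact formulation of any single paper.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive (NT-Xent-style) loss over a batch of paired views.

    z1, z2: [N, D] representations of two augmented views of the same N images.
    Each sample is pulled toward its positive pair and pushed away from the
    other 2N - 2 samples in the batch (a varying, batch-dependent target,
    unlike a fixed pixel-reconstruction target).
    """
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # [2N, D], unit-norm rows
    sim = z @ z.t() / temperature                         # scaled cosine similarities
    n = z1.size(0)
    sim.fill_diagonal_(float('-inf'))                     # a sample is never its own negative
    # The positive for row i is its other view: column i + n (or i - n).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```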
A common evaluation protocol is to train a linear classifier on top of (frozen) representations learnt by self-supervised methods. The leaderboards for the linear evaluation protocol can be found below. In practice, it is more common to fine-tune features on a downstream task. An alternative evaluation protocol therefore uses semi-supervised learning and fine-tunes on a fraction of the labels. The leaderboards for the fine-tuning protocol can be accessed here.
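A minimal sketch of the linear evaluation protocol in PyTorch is shown below; `encoder`, `train_loader`, the feature dimension, and the optimizer settings are placeholders for whatever self-supervised backbone and data pipeline are being evaluated, not a prescription from any particular paper.

```python
import torch
import torch.nn as nn

def linear_eval(encoder, train_loader, feat_dim=2048, num_classes=1000,
                epochs=10, device='cuda'):
    """Train a linear classifier on top of frozen self-supervised features."""
    encoder.eval()                        # freeze the backbone
    for p in encoder.parameters():
        p.requires_grad = False

    clf = nn.Linear(feat_dim, num_classes).to(device)
    opt = torch.optim.SGD(clf.parameters(), lr=0.1, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():         # representations stay fixed
                feats = encoder(images)
            loss = loss_fn(clf(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return clf
```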
You may want to read some blog posts before reading the papers and checking the leaderboards:
There is also Yann LeCun's talk at AAAI-20, which you can watch here (from 35:00 onward).
(Image credit: A Simple Framework for Contrastive Learning of Visual Representations)
Representative paper excerpts from the leaderboards:

- "Many recent methods for unsupervised or self-supervised representation learning train feature extractors by maximizing an estimate of the mutual information (MI) between different views of the data."
- "This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning." (#6 best model for Self-Supervised Image Classification on ImageNet; see the sketch after this list)
- "We analyze key properties of the approach that make it work, finding that the contrastive loss outperforms a popular alternative based on cross-view prediction, and that the more views we learn from, the better the resulting representation captures underlying scene semantics." (#7 best model for Self-Supervised Image Classification on ImageNet)
- "Contrastive unsupervised learning has recently shown encouraging progress, e.g. in Momentum Contrast (MoCo) and SimCLR." (#4 best model for Self-Supervised Image Classification on ImageNet)
- "This paper presents SimCLR: a simple framework for contrastive learning of visual representations."
- "Unsupervised visual representation learning remains a largely unsolved problem in computer vision research."
- "The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models." (#5 best model for Semi-Supervised Image Classification on ImageNet - 1% labeled data)
- "Following our proposed approach, we develop a model which learns image representations that significantly outperform prior methods on the tasks we consider." (#8 best model for Self-Supervised Image Classification on ImageNet)
- "The goal of self-supervised learning from images is to construct image representations that are semantically meaningful via pretext tasks that do not require semantic annotations for a large training set of images." (#9 best model for Semi-Supervised Image Classification on ImageNet - 10% labeled data)
- "We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation." (SOTA for Image Generation on ImageNet 64x64, Inception Score metric)
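The "large and consistent dictionary" mentioned in the second excerpt above (MoCo) is typically realized as a FIFO queue of encoded keys produced by a slowly moving momentum encoder. The following is only a minimal sketch of those two updates; the function names, tensor shapes, and momentum coefficient are illustrative assumptions, not the papers' reference implementation.

```python
import torch

@torch.no_grad()
def momentum_update(query_encoder, key_encoder, m=0.999):
    """Move the key encoder slowly toward the query encoder (MoCo-style)."""
    for q_param, k_param in zip(query_encoder.parameters(), key_encoder.parameters()):
        k_param.data = m * k_param.data + (1.0 - m) * q_param.data

@torch.no_grad()
def enqueue_dequeue(queue, new_keys):
    """Maintain a FIFO dictionary of encoded keys used as negatives.

    queue:    [K, D] tensor holding the current dictionary (K >= batch size).
    new_keys: [B, D] keys from the latest batch; the oldest B entries are dropped.
    """
    return torch.cat([queue[new_keys.size(0):], new_keys], dim=0)
```

Because the key encoder changes slowly, entries that have been sitting in the queue for several iterations remain roughly consistent with freshly encoded keys, which is what makes such a large set of negatives usable for the contrastive loss.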