BYOL (Bootstrap Your Own Latent) is a new approach to self-supervised learning. BYOL's goal is to learn a representation $y_\theta$ which can then be used for downstream tasks. BYOL uses two neural networks to learn: the online and target networks. The online network is defined by a set of weights $\theta$ and comprises three stages: an encoder $f_\theta$, a projector $g_\theta$ and a predictor $q_\theta$. The target network has the same architecture as the online network, but uses a different set of weights $\xi$. The target network provides the regression targets to train the online network, and its parameters $\xi$ are an exponential moving average of the online parameters $\theta$.
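For concreteness, a minimal PyTorch sketch of these three stages and the EMA-updated target network might look as follows. The ResNet-50 encoder and the 4096-hidden/256-output MLP shape follow the paper, but the helper names here (`mlp`, `online`, `target`) are purely illustrative:

```python
import copy

import torch.nn as nn
from torchvision.models import resnet50

def mlp(in_dim, hidden_dim=4096, out_dim=256):
    # Projector/predictor architecture from the paper:
    # linear -> batch norm -> ReLU -> linear.
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.BatchNorm1d(hidden_dim),
        nn.ReLU(inplace=True),
        nn.Linear(hidden_dim, out_dim),
    )

encoder = resnet50()          # f_theta; its 2048-d output is y_theta
encoder.fc = nn.Identity()    # drop the classification head
projector = mlp(2048)         # g_theta
predictor = mlp(256)          # q_theta, used only on the online side

online = nn.Sequential(encoder, projector)  # weights theta
target = copy.deepcopy(online)              # weights xi, initialized to theta
for p in target.parameters():
    p.requires_grad = False   # xi is updated by EMA, never by gradients
```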
As illustrated in the paper's architecture diagram, BYOL minimizes a similarity loss between $q_\theta(z_\theta)$ and $\mathrm{sg}(z'_\xi)$, where $\theta$ are the trained weights, $\xi$ is an exponential moving average of $\theta$, and $\mathrm{sg}$ denotes stop-gradient. At the end of training, everything but $f_\theta$ is discarded, and $y_\theta$ is used as the image representation.
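A sketch of one training step under the same assumptions: the similarity loss (mean squared error between $\ell_2$-normalized vectors, equivalently $2 - 2\cos$), the symmetrization over two augmented views, and the EMA update $\xi \leftarrow \tau\xi + (1-\tau)\theta$ all follow the paper, while `training_step` itself is a hypothetical helper building on the modules defined above:

```python
import torch
import torch.nn.functional as F

def byol_loss(q, z):
    # MSE between l2-normalized vectors, which equals
    # 2 - 2 * cosine_similarity(q, z).
    q = F.normalize(q, dim=-1)
    z = F.normalize(z, dim=-1)
    return 2 - 2 * (q * z).sum(dim=-1)

def training_step(online, target, predictor, optimizer, v1, v2, tau=0.996):
    # Online branch on both augmented views: q_theta(g_theta(f_theta(v))).
    q1 = predictor(online(v1))
    q2 = predictor(online(v2))
    # Target branch under no_grad, i.e. the stop-gradient sg(z'_xi).
    with torch.no_grad():
        z1 = target(v1)
        z2 = target(v2)
    # Symmetrized loss: each view predicts the other view's target projection.
    loss = (byol_loss(q1, z2) + byol_loss(q2, z1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # EMA update: xi <- tau * xi + (1 - tau) * theta
    # (the paper anneals tau from a base of 0.996 toward 1 during training).
    with torch.no_grad():
        for p_t, p_o in zip(target.parameters(), online.parameters()):
            p_t.mul_(tau).add_(p_o, alpha=1 - tau)
    return loss.item()
```

Note that gradients flow only through the online branch; the predictor and the stop-gradient together are what prevent the two networks from collapsing to a constant representation without any negative pairs.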
Source: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning
Image credit: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning
| Task | Papers | Share |
| --- | --- | --- |
| Self-Supervised Learning | 78 | 33.19% |
| Image Classification | 11 | 4.68% |
| Semantic Segmentation | 7 | 2.98% |
| Object Detection | 6 | 2.55% |
| Diversity | 5 | 2.13% |
| Action Recognition | 4 | 1.70% |
| Pseudo Label | 4 | 1.70% |
| Clustering | 4 | 1.70% |
| Time Series Analysis | 4 | 1.70% |