Supervised Contrastive Loss is an alternative loss function to cross entropy that the authors argue can leverage label information more effectively. Clusters of points belonging to the same class are pulled together in embedding space, while clusters of samples from different classes are simultaneously pushed apart.
$$ \mathcal{L}^{sup}=\sum_{i=1}^{2N}\mathcal{L}_i^{sup} $$
$$ \mathcal{L}_i^{sup}=\frac{-1}{2N_{\boldsymbol{\tilde{y}}_i}-1}\sum_{j=1}^{2N}\mathbf{1}_{i\neq j}\cdot\mathbf{1}_{\boldsymbol{\tilde{y}}_i=\boldsymbol{\tilde{y}}_j}\cdot\log{\frac{\exp{\left(\boldsymbol{z}_i\cdot\boldsymbol{z}_j/\tau\right)}}{\sum_{k=1}^{2N}\mathbf{1}_{i\neq k}\cdot\exp{\left(\boldsymbol{z}_i\cdot\boldsymbol{z}_k/\tau\right)}}} $$
where $N_{\boldsymbol{\tilde{y}}_i}$ is the total number of images in the minibatch that have the same label, $\boldsymbol{\tilde{y}}_i$, as the anchor, $i$. This loss has important properties well suited for supervised learning: (a) it generalizes to an arbitrary number of positives, and (b) its contrastive power increases with the number of negatives.
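As a concrete illustration, here is a minimal PyTorch sketch of the loss above. It assumes `z` is a `(2N, d)` tensor of L2-normalized embeddings (two augmented views per image, stacked along the batch dimension) and `labels` is the matching `(2N,)` vector of class labels; the function name, arguments, and default temperature are illustrative choices, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z: torch.Tensor, labels: torch.Tensor,
                                tau: float = 0.1) -> torch.Tensor:
    """Sketch of L^{sup}: z is (2N, d) L2-normalized, labels is (2N,)."""
    two_n = z.size(0)
    # Pairwise logits z_i . z_k / tau, shape (2N, 2N).
    logits = z @ z.T / tau
    # The 1_{i != k} indicator: exclude self-similarity from the denominator.
    self_mask = torch.eye(two_n, dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(self_mask, float('-inf'))
    # log softmax: log exp(z_i.z_j/tau) - log sum_{k != i} exp(z_i.z_k/tau).
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Positives: same label as the anchor, excluding j == i.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # Each anchor has 2*N_{y_i} - 1 positives; clamp guards empty rows.
    num_pos = pos_mask.sum(dim=1).clamp(min=1)
    # L_i^{sup} = -(1 / (2*N_{y_i} - 1)) * sum over positives of log_prob.
    loss_i = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / num_pos
    return loss_i.sum()  # L^{sup} = sum_i L_i^{sup}

if __name__ == "__main__":
    # Toy usage: N = 4 images, two views each, so 2N = 8 embeddings.
    z = F.normalize(torch.randn(8, 128), dim=1)
    labels = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3])  # duplicated per view
    print(supervised_contrastive_loss(z, labels, tau=0.1))
```

In practice the two augmented views of each image are concatenated along the batch dimension before the projection network produces `z`; implementations also commonly average the per-anchor losses instead of summing them, which only rescales the gradient.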
Source: Supervised Contrastive Learning

| Task | Papers | Share |
|------|--------|-------|
| Image Classification | 9 | 12.86% |
| Long-tail Learning | 3 | 4.29% |
| Knowledge Distillation | 3 | 4.29% |
| Self-Supervised Learning | 3 | 4.29% |
| Semantic Segmentation | 2 | 2.86% |
| Relation Extraction | 2 | 2.86% |
| Continual Learning | 2 | 2.86% |
| Language Modelling | 2 | 2.86% |
| Domain Generalization | 1 | 1.43% |