We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
However, it requires that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.
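The overhead comes from scoring every sentence pair with the cross-encoder. A quick back-of-the-envelope check reproduces the numbers; the per-pair latency below is an assumed figure, chosen only so the total matches the quoted ~65 hours:

```python
# Number of unordered sentence pairs among n sentences: n * (n - 1) / 2.
n = 10_000
pairs = n * (n - 1) // 2
print(pairs)  # 49,995,000 -- roughly the "50 million inference computations" above

# Assumed per-pair BERT inference time (hypothetical, picked to match ~65 hours).
seconds_per_pair = 4.7e-3
print(pairs * seconds_per_pair / 3600)  # ~65 hours
```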
This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings.
Inspired by recent advances in deep metric learning (DML), we carefully design a self-supervised objective for learning universal sentence embeddings that does not require labelled training data.
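The abstracts do not include code, but the shared idea behind SimCSE-style training is an in-batch contrastive (InfoNCE) objective over sentence embeddings. The sketch below is a minimal illustration in PyTorch, assuming an encoder that maps a batch of sentences to fixed-size vectors; all names and the toy inputs are hypothetical, not taken from the papers:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor_emb: torch.Tensor, positive_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive (InfoNCE) loss over sentence embeddings.

    anchor_emb, positive_emb: [batch_size, dim] embeddings of two views of the
    same sentences (e.g. two dropout-noised encodings, as in SimCSE).
    Every other sentence in the batch serves as a negative.
    """
    anchor = F.normalize(anchor_emb, dim=-1)
    positive = F.normalize(positive_emb, dim=-1)
    # Cosine similarity between every anchor and every positive in the batch.
    sim = anchor @ positive.T / temperature          # [batch, batch]
    # The matching pair sits on the diagonal.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)

# Toy usage: random vectors stand in for encoder outputs.
a = torch.randn(8, 768)
b = a + 0.01 * torch.randn(8, 768)  # slightly perturbed views of the same sentences
print(contrastive_loss(a, b).item())
```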
Similarly to text embeddings, we train code embedding models on (text, code) pairs, obtaining a 20.8% relative improvement over prior best work on code search.
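Once text and code share an embedding space, code search reduces to nearest-neighbour lookup by cosine similarity. The snippet below is a hedged illustration of that retrieval step, with random vectors standing in for real (text, code) embeddings:

```python
import numpy as np

def search(query_emb: np.ndarray, code_embs: np.ndarray, top_k: int = 3) -> np.ndarray:
    """Rank code snippets by cosine similarity to a natural-language query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    c = code_embs / np.linalg.norm(code_embs, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores)[:top_k]

# Toy corpus: random vectors stand in for embeddings of 100 code snippets.
rng = np.random.default_rng(0)
code_embs = rng.normal(size=(100, 256))
query_emb = code_embs[42] + 0.1 * rng.normal(size=256)  # query close to snippet 42
print(search(query_emb, code_embs))  # snippet 42 should rank first
```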
DINO-MC: Self-supervised Contrastive Learning for Remote Sensing Imagery with Multi-sized Local Crops
Due to the costly nature of remote sensing image labeling and the large volume of available unlabeled imagery, self-supervised methods that can learn feature representations without manual annotation have received great attention.