Semantic Textual Similarity

557 papers with code • 13 benchmarks • 17 datasets

Semantic textual similarity deals with determining how similar two pieces of texts are. This can take the form of assigning a score from 1 to 5. Related tasks are paraphrase or duplicate identification.

Image source: Learning Semantic Textual Similarity from Conversations

Benchmarks

Add a Result

These leaderboards are used to track progress in Semantic Textual Similarity

Dataset	Best Model	Compare
STS Benchmark	MT-DNN-SMART	See all
MRPC	MT-DNN-SMART	See all
MTEB	ST5-XXL	See all
STS13	AnglE-LLaMA-13B	See all
SICK	PromCSE-RoBERTa-large (0.355B)	See all
STS12	PromptEOL+CSE+OPT-13B	See all
STS14	AnglE-LLaMA-13B	See all
STS15	AnglE-LLaMA-13B	See all
STS16	AnglE-LLaMA-13B	See all
SentEval	GenSen	See all
CxC	PromCSE-RoBERTa-large (0.355B)	See all
SICK-R	AnglE-LLaMA-7B	See all
MRPC Dev	Synthesizer (R+V)	See all

Show all 13 benchmarks

Collapse benchmarks

Libraries

Use these libraries to find Semantic Textual Similarity models and implementations

huggingface/transformers

9 papers

124,527

facebookresearch/xformers

3 papers

7,508

facebookresearch/InferSent

3 papers

2,279

namisan/mt-dnn

3 papers

2,198

See all 11 libraries.

Datasets

Subtasks

Most implemented papers

Most implemented Social Latest No code

ERNIE: Enhanced Representation through Knowledge Integration

PaddlePaddle/PaddleNLP • • 19 Apr 2019

We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration).

Paper
Code

FNet: Mixing Tokens with Fourier Transforms

google-research/google-research • • NAACL 2022

At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).

Paper
Code

Improving Language Understanding by Generative Pre-Training

huggingface/transformers • • Preprint 2018

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

Paper
Code

Big Bird: Transformers for Longer Sequences

google-research/bigbird • • NeurIPS 2020

To remedy this, we propose, BigBird, a sparse attention mechanism that reduces this quadratic dependency to linear.

Paper
Code

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

microsoft/DeBERTa • • ICLR 2021

Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks.

Paper
Code

TinyBERT: Distilling BERT for Natural Language Understanding

huawei-noah/Pretrained-Language-Model • • Findings of the Association for Computational Linguistics 2020

To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of the Transformer-based models.

Paper
Code

TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning

UKPLab/sentence-transformers • • 14 Apr 2021

Learning sentence embeddings often requires a large amount of labeled data.

Paper
Code

SpanBERT: Improving Pre-training by Representing and Predicting Spans

facebookresearch/SpanBERT • • TACL 2020

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.

Paper
Code

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

namisan/mt-dnn • • ACL 2020

However, due to limited data resources from downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the data of downstream tasks and forget the knowledge of the pre-trained model.

Paper
Code

Language-agnostic BERT Sentence Embedding

FreddeFrallan/Multilingual-CLIP • • ACL 2022

While BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding based transfer learning (Reimers and Gurevych, 2019), BERT based cross-lingual sentence embeddings have yet to be explored.

Paper
Code

Semantic Textual Similarity

Benchmarks Add a Result

Libraries

Datasets

Subtasks

Most implemented papers

Content

Benchmarks

Add a Result