Semantic Textual Similarity

385 papers with code • 12 benchmarks • 17 datasets

Semantic textual similarity deals with determining how similar two pieces of text are. This can take the form of assigning a score from 1 to 5. Related tasks include paraphrase identification and duplicate detection.
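As a toy illustration of what assigning such a score can look like, the sketch below compares two texts by the cosine similarity of their word-count vectors and rescales the result to the 1 to 5 range. The function names, the bag-of-words representation, and the linear rescaling are all assumptions made for illustration; the papers listed on this page use learned representations instead.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of lowercase, whitespace-tokenized word-count vectors."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no overlap with anything
    return dot / (norm_a * norm_b)


def to_rating(similarity: float, low: float = 1.0, high: float = 5.0) -> float:
    """Linearly rescale a similarity in [0, 1] to [low, high] (assumed mapping)."""
    return low + (high - low) * similarity


rating = to_rating(cosine_similarity("a man is playing a guitar", "a man plays a guitar"))
print(round(rating, 2))
```

Word counts are non-negative, so the cosine stays in [0, 1] and the rating stays in [1, 5]. A word-overlap baseline like this cannot recognize paraphrases that share no vocabulary, which is the gap that learned sentence representations aim to close.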

Image source: Learning Semantic Textual Similarity from Conversations



Most implemented papers

RoBERTa: A Robustly Optimized BERT Pretraining Approach

pytorch/fairseq 26 Jul 2019

Language model pretraining has led to significant performance gains, but careful comparison between different approaches is challenging.

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

UKPLab/sentence-transformers IJCNLP 2019

BERT requires that both sentences be fed into the network together, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours).
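The figure of about 50 million follows from counting unordered sentence pairs, n(n-1)/2. The sketch below reproduces that count and derives a rough per-pair cost from the quoted 65 hours. The helper name is hypothetical, and the per-pair time is a back-of-envelope estimate implied by the two quoted numbers, not a measured value.

```python
import math


def num_pairs(n_sentences: int) -> int:
    """Number of unordered sentence pairs, n * (n - 1) / 2."""
    return math.comb(n_sentences, 2)


n = 10_000
pairs = num_pairs(n)  # 49,995,000, i.e. about 50 million pair-wise passes

# Implied average cost per pair if all pairs take ~65 hours in total.
seconds_per_pair = 65 * 3600 / pairs

print(pairs)
print(round(seconds_per_pair * 1000, 2), "ms per pair (implied)")
```

The pair count grows quadratically with the collection size. An approach that embeds each sentence independently would, by contrast, need only one forward pass per sentence (10,000 here) before comparing the resulting vectors.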

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

google-research/ALBERT ICLR 2020

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

google-research/text-to-text-transfer-transformer arXiv 2019

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

huggingface/transformers NeurIPS 2019

As transfer learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on the edge and/or under constrained computational training or inference budgets remains challenging.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

zihangdai/xlnet NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

facebookresearch/InferSent EMNLP 2017

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features.

Universal Sentence Encoder

facebookresearch/InferSent 29 Mar 2018

For both variants of the sentence encoding model, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance.

SimCSE: Simple Contrastive Learning of Sentence Embeddings

princeton-nlp/SimCSE EMNLP 2021

This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings.

FNet: Mixing Tokens with Fourier Transforms

google-research/google-research NAACL 2022

At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).