About

Semantic textual similarity deals with determining how similar two pieces of text are. This can take the form of assigning a score from 1 to 5. Related tasks include paraphrase and duplicate identification.

Benchmarks

Greatest papers with code

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

NeurIPS 2019 huggingface/transformers

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains challenging.

KNOWLEDGE DISTILLATION LANGUAGE MODELLING LINGUISTIC ACCEPTABILITY NATURAL LANGUAGE INFERENCE QUESTION ANSWERING SEMANTIC TEXTUAL SIMILARITY SENTIMENT ANALYSIS TRANSFER LEARNING

XLNet: Generalized Autoregressive Pretraining for Language Understanding

NeurIPS 2019 huggingface/transformers

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

DOCUMENT RANKING HUMOR DETECTION LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE PARAPHRASE IDENTIFICATION QUESTION ANSWERING READING COMPREHENSION SENTIMENT ANALYSIS TEXT CLASSIFICATION

Improving Language Understanding by Generative Pre-Training

Preprint 2018 huggingface/transformers

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

DOCUMENT CLASSIFICATION LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE NATURAL LANGUAGE UNDERSTANDING QUESTION ANSWERING SEMANTIC SIMILARITY SEMANTIC TEXTUAL SIMILARITY

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

IJCNLP 2019 UKPLab/sentence-transformers

However, it requires that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.

Ranked #6 on Semantic Textual Similarity on STS Benchmark (Spearman Correlation metric)

SEMANTIC SIMILARITY SEMANTIC TEXTUAL SIMILARITY SENTENCE EMBEDDINGS TRANSFER LEARNING
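The overhead described in the Sentence-BERT abstract is simple combinatorics: a cross-encoder must run one forward pass per sentence pair, while a bi-encoder embeds each sentence once and compares cheap vectors afterwards. A short sketch of the count (function names are illustrative, not from the paper):

```python
def cross_encoder_inferences(n: int) -> int:
    """A cross-encoder like vanilla BERT scores each pair jointly, so
    finding the most similar pair costs n * (n - 1) / 2 forward passes."""
    return n * (n - 1) // 2

def bi_encoder_inferences(n: int) -> int:
    """A bi-encoder like Sentence-BERT embeds each sentence once; the
    pairwise search then runs on inexpensive vector comparisons."""
    return n

n = 10_000
print(cross_encoder_inferences(n))  # 49995000, the ~50 million from the abstract
print(bi_encoder_inferences(n))     # 10000
```

At n = 10,000 the gap is roughly four orders of magnitude, which is why precomputed sentence embeddings make large-scale similarity search practical.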