
Semantic Textual Similarity

65 papers with code · Natural Language Processing

Semantic textual similarity deals with determining how similar two pieces of text are. This can take the form of assigning a score from 0 to 5, where 5 means the two texts are equivalent in meaning. Related tasks include paraphrase and duplicate identification.
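As a minimal illustration of the task (not tied to any particular paper below), a common baseline scores a sentence pair by the cosine between averaged word vectors. The tiny 3-dimensional vectors here are made up for the sketch; real systems use pretrained embeddings with hundreds of dimensions:

```python
import math

# Toy word vectors (hypothetical 3-d embeddings for illustration only).
VECS = {
    "a":       [0.1, 0.0, 0.2],
    "man":     [0.9, 0.3, 0.1],
    "guy":     [0.8, 0.4, 0.1],
    "is":      [0.0, 0.1, 0.1],
    "running": [0.2, 0.9, 0.5],
    "jogging": [0.3, 0.8, 0.6],
}

def embed(sentence):
    """Represent a sentence as the average of its word vectors."""
    words = [VECS[w] for w in sentence.lower().split() if w in VECS]
    dim = len(next(iter(VECS.values())))
    return [sum(v[i] for v in words) / len(words) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Two paraphrase-like sentences should score close to 1.
sim = cosine(embed("a man is running"), embed("a guy is jogging"))
```

A predicted cosine in [-1, 1] can then be rescaled to whatever range a benchmark uses; supervised models typically regress onto the gold scores directly instead.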

Greatest papers with code

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

ICLR 2018 facebookresearch/InferSent

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model.

MULTI-TASK LEARNING NATURAL LANGUAGE INFERENCE PARAPHRASE IDENTIFICATION SEMANTIC TEXTUAL SIMILARITY

Universal Sentence Encoder

29 Mar 2018 facebookresearch/InferSent

For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. We find that transfer learning using sentence embeddings tends to outperform word level transfer.

SEMANTIC TEXTUAL SIMILARITY SENTENCE EMBEDDINGS SENTIMENT ANALYSIS SUBJECTIVITY ANALYSIS TEXT CLASSIFICATION TRANSFER LEARNING WORD EMBEDDINGS

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

EMNLP 2017 facebookresearch/InferSent

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful.

CROSS-LINGUAL NATURAL LANGUAGE INFERENCE SEMANTIC TEXTUAL SIMILARITY TRANSFER LEARNING WORD EMBEDDINGS

Improving Language Understanding by Generative Pre-Training

Preprint 2018 openai/finetune-transformer-lm

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding.

DOCUMENT CLASSIFICATION LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE QUESTION ANSWERING SEMANTIC TEXTUAL SIMILARITY

Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering

COLING 2018 lanwuwei/SPM_toolkit

In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks. Although most of these models have claimed state-of-the-art performance, the original papers often reported on only one or two selected datasets.

NATURAL LANGUAGE INFERENCE PARAPHRASE IDENTIFICATION QUESTION ANSWERING SEMANTIC TEXTUAL SIMILARITY SENTENCE PAIR MODELING

Character-based Neural Networks for Sentence Pair Modeling

HLT 2018 lanwuwei/SPM_toolkit

Sentence pair modeling is critical for many NLP tasks, such as paraphrase identification, semantic textual similarity, and natural language inference. Most state-of-the-art neural models for these tasks rely on pretrained word embedding and compose sentence-level semantics in varied ways; however, few works have attempted to verify whether we really need pretrained embeddings in these tasks.

NATURAL LANGUAGE INFERENCE PARAPHRASE IDENTIFICATION SEMANTIC TEXTUAL SIMILARITY SENTENCE PAIR MODELING

Determining Semantic Textual Similarity using Natural Deduction Proofs

EMNLP 2017 mynlp/ccg2lambda

Determining semantic textual similarity is a core research subject in natural language processing. Since vector-based models for sentence representation often use shallow information, capturing accurate semantics is difficult.

SEMANTIC TEXTUAL SIMILARITY

Counter-fitting Word Vectors to Linguistic Constraints

HLT 2016 nmrksic/counter-fitting

In this work, we present a novel counter-fitting method which injects antonymy and synonymy constraints into vector space representations in order to improve the vectors' capability for judging semantic similarity. Applying this method to publicly available pre-trained word vectors leads to new state-of-the-art performance on the SimLex-999 dataset.
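A heavily simplified sketch of the counter-fitting idea: iteratively pull synonym vectors together and push antonym vectors apart. The real method optimizes a combined objective that also preserves the original vector-space geometry; the 2-d vectors, word pairs, and learning rate below are illustrative assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Illustrative vectors where an antonym pair starts out suspiciously close,
# as often happens with distributionally trained embeddings.
vecs = {"cheap": [1.0, 0.1], "expensive": [0.9, 0.2], "inexpensive": [0.8, 0.0]}
synonyms = [("cheap", "inexpensive")]
antonyms = [("cheap", "expensive")]

def counter_fit_step(vecs, lr=0.1):
    """One update: move synonyms toward each other, antonyms away."""
    for a, b in synonyms:
        for i in range(len(vecs[a])):
            delta = vecs[b][i] - vecs[a][i]
            vecs[a][i] += lr * delta
            vecs[b][i] -= lr * delta
    for a, b in antonyms:
        for i in range(len(vecs[a])):
            delta = vecs[b][i] - vecs[a][i]
            vecs[a][i] -= lr * delta
            vecs[b][i] += lr * delta

before = cosine(vecs["cheap"], vecs["expensive"])
for _ in range(10):
    counter_fit_step(vecs)
after = cosine(vecs["cheap"], vecs["expensive"])  # lower than before
```

After a few steps the antonym pair's similarity drops while the synonym pair stays close, which is the effect the paper exploits to improve similarity judgements.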

DIALOGUE STATE TRACKING SEMANTIC TEXTUAL SIMILARITY

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

WS 2017 nathanshartmann/portuguese_word_embeddings

Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants.

SEMANTIC TEXTUAL SIMILARITY WORD EMBEDDINGS

Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding

17 Feb 2016 shanzhenren/PLE

Current systems of fine-grained entity typing use distant supervision in conjunction with existing knowledge bases to assign categories (type labels) to entity mentions. However, the type labels so obtained from knowledge bases are often noisy (i.e., incorrect for the entity mention's local context).

SEMANTIC TEXTUAL SIMILARITY