Semantic Similarity

178 papers with code • 7 benchmarks • 6 datasets

The main objective of Semantic Similarity is to measure the distance between the semantic meanings of a pair of words, phrases, sentences, or documents. For example, the word “car” is more similar to “bus” than it is to “cat”. The two main approaches to measuring Semantic Similarity are knowledge-based approaches and corpus-based, distributional methods.
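In distributional methods, words are represented as embedding vectors and similarity is typically scored with a measure such as cosine similarity. The following is a minimal sketch using toy, hand-picked 3-dimensional vectors (not taken from any real embedding model) to illustrate the “car”/“bus”/“cat” example above:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy embeddings chosen only for illustration; a real system would use
# vectors learned from a large corpus.
vectors = {
    "car": [0.9, 0.1, 0.0],
    "bus": [0.8, 0.2, 0.1],
    "cat": [0.1, 0.9, 0.3],
}

sim_car_bus = cosine_similarity(vectors["car"], vectors["bus"])
sim_car_cat = cosine_similarity(vectors["car"], vectors["cat"])
assert sim_car_bus > sim_car_cat  # "car" is closer to "bus" than to "cat"
```

With learned embeddings the same comparison extends to phrases, sentences, or documents by embedding the whole text and comparing the resulting vectors.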

Source: Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection

Greatest papers with code

Improving Language Understanding by Generative Pre-Training

huggingface/transformers Preprint 2018

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

Document Classification Language Modelling +5

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

UKPLab/sentence-transformers IJCNLP 2019

However, it requires that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.
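The ~50 million figure is just the number of unordered sentence pairs, n(n−1)/2 for n = 10,000. The contrast with a bi-encoder such as Sentence-BERT, which encodes each sentence once and then compares cached vectors cheaply, can be sketched as:

```python
from math import comb

n = 10_000

# Cross-encoder (plain BERT): every candidate pair needs its own forward pass.
cross_encoder_passes = comb(n, 2)  # n * (n - 1) // 2

# Bi-encoder (Sentence-BERT-style): each sentence is encoded once; pairs are
# then compared with cheap vector operations such as cosine similarity.
bi_encoder_passes = n

print(cross_encoder_passes)  # 49995000 (~50 million, as quoted above)
print(bi_encoder_passes)     # 10000
```

The pairwise comparisons still happen in the bi-encoder setting, but on fixed-size vectors rather than through the full network, which is what makes large-scale similarity search practical.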

Semantic Similarity Semantic Textual Similarity +2

ERNIE: Enhanced Representation through Knowledge Integration

PaddlePaddle/ERNIE 19 Apr 2019

We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration).

Chinese Named Entity Recognition Chinese Sentence Pair Classification +6

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

tensorflow/fold IJCNLP 2015

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks.

General Classification Semantic Similarity +1

A Hybrid Neural Network Model for Commonsense Reasoning

namisan/mt-dnn WS 2019

An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers.

Common Sense Reasoning Language Modelling +2

Top2Vec: Distributed Representations of Topics

ddangelov/Top2Vec 19 Aug 2020

Distributed representations of documents and words have gained popularity due to their ability to capture semantics of words and documents.

Lemmatization Semantic Similarity +2

On the Sentence Embeddings from Pre-trained Language Models

InsaneLife/dssm EMNLP 2020

Pre-trained contextual representations like BERT have achieved great success in natural language processing.

Language Modelling Semantic Similarity +2

Hierarchy-based Image Embeddings for Semantic Image Retrieval

cvjena/semantic-embeddings 26 Sep 2018

Such an embedding not only improves image retrieval results, but could also facilitate integrating semantics for other tasks, e.g., novelty detection or few-shot learning.

Few-Shot Learning Image Retrieval +2

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

nathanshartmann/portuguese_word_embeddings WS 2017

Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems.

Semantic Similarity Semantic Textual Similarity +1