The main objective of Semantic Similarity is to measure the distance between the semantic meanings of a pair of words, phrases, sentences, or documents. For example, the word “car” is more similar to “bus” than it is to “cat”. The two main approaches to measuring Semantic Similarity are knowledge-based approaches and corpus-based, distributional methods.
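In distributional methods, words are represented as embedding vectors and similarity is typically measured with cosine similarity. The following minimal sketch illustrates the "car"/"bus"/"cat" example above; the 3-dimensional vectors are made-up toy values, not embeddings from any real model.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" (illustrative values only):
car = [0.8, 0.1, 0.1]
bus = [0.7, 0.2, 0.1]
cat = [0.1, 0.1, 0.9]

# "car" should score closer to "bus" than to "cat":
print(cosine_similarity(car, bus) > cosine_similarity(car, cat))  # True
```

Real systems would obtain the vectors from a trained model (e.g. word2vec, GloVe, or a sentence encoder), but the comparison step is the same.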
We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.
Ranked #3 on Question Answering on Story Cloze Test
However, it requires that both sentences are fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.
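The "50 million" figure follows from the number of distinct pairs among 10,000 sentences, n·(n−1)/2, since the cross-encoder must run one forward pass per pair. A quick check:

```python
def pairwise_comparisons(n):
    """Number of distinct unordered pairs among n items: n choose 2."""
    return n * (n - 1) // 2

# For 10,000 sentences, one BERT inference per pair:
print(pairwise_comparisons(10_000))  # 49995000, i.e. roughly 50 million
```

A bi-encoder (as in Sentence-BERT) instead encodes each sentence once — 10,000 forward passes — and compares the cached embeddings cheaply.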
Ranked #6 on Semantic Textual Similarity on STS Benchmark (Spearman Correlation metric)
We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration).
Ranked #2 on Chinese Sentence Pair Classification on LCQMC Dev
Tasks: Chinese Named Entity Recognition, Chinese Sentence Pair Classification, Chinese Sentiment Analysis, Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis
Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks.
Ranked #1 on Semantic Similarity on SICK
An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers.
Ranked #1 on Common Sense Reasoning on Winograd Schema Challenge
Ranked #1 on Semantic Similarity on BIOSSES
Tasks: Document Classification, Drug–Drug Interaction Extraction, Medical Named Entity Recognition, Medical Relation Extraction, Natural Language Inference, Relation Extraction, Semantic Similarity, Transfer Learning
Such an embedding not only improves image retrieval results, but could also facilitate integrating semantics for other tasks, e.g., novelty detection or few-shot learning.
Traditionally, supervision for this problem is expressed as sets of points that follow an ordinal relationship: an anchor point $x$ is similar to a set of positive points $Y$ and dissimilar to a set of negative points $Z$, and a loss defined over these distances is minimized.
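The ordinal supervision described above is commonly realized as a hinge-style (triplet-like) loss. The sketch below is one minimal, generic instance of such a loss — not the specific formulation of any paper listed here — using plain Euclidean distance and a margin hyperparameter:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def ordinal_loss(x, positives, negatives, margin=1.0):
    """Hinge loss over an anchor x: every positive point should be closer
    to x than every negative point is, by at least `margin`."""
    loss = 0.0
    for y in positives:
        for z in negatives:
            loss += max(0.0, margin + euclidean(x, y) - euclidean(x, z))
    return loss

# Toy example: the positive is much closer than the negative,
# so the margin is satisfied and the loss is zero.
x = [0.0, 0.0]
Y = [[0.1, 0.0]]   # positive set
Z = [[3.0, 0.0]]   # negative set
print(ordinal_loss(x, Y, Z))  # 0.0
```

In practice the distances are computed between learned embeddings and the loss is minimized by gradient descent over the embedding parameters; frameworks such as PyTorch ship a comparable `TripletMarginLoss`.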