Sentence Similarity
74 papers with code • 1 benchmark • 1 dataset
Most implemented papers
SentEval: An Evaluation Toolkit for Universal Sentence Representations
We introduce SentEval, a toolkit for evaluating the quality of universal sentence representations.
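To show how an encoder plugs into the toolkit, here is a minimal sketch assuming the facebookresearch/SentEval package is installed and its task data downloaded; the data path is a placeholder and the bag-of-words batcher is only a toy stand-in for a real sentence encoder.

```python
# Minimal SentEval usage sketch: plug a toy encoder into the evaluation engine.
import numpy as np
import senteval

def prepare(params, samples):
    # Build task-specific state here (e.g. a vocabulary); not needed for this toy encoder.
    return

def batcher(params, batch):
    # Encode each sentence as the mean of deterministic per-word random vectors
    # (a toy stand-in for real word embeddings or a trained sentence encoder).
    embeddings = []
    for sent in batch:
        words = sent if sent else ['.']
        vecs = [np.random.RandomState(abs(hash(w)) % (2 ** 32)).randn(50) for w in words]
        embeddings.append(np.mean(vecs, axis=0))
    return np.vstack(embeddings)

params = {'task_path': 'PATH_TO_SENTEVAL_DATA', 'usepytorch': False, 'kfold': 5}
se = senteval.engine.SE(params, batcher, prepare)
results = se.eval(['STS12', 'STS13'])  # similarity tasks; see the repo for the full list
print(results)
```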
Calculating the similarity between words and sentences using a lexical database and corpus statistics
To calculate the semantic similarity between words and sentences, the proposed method follows an edge-based approach using a lexical database.
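As an illustration of the edge-based idea, word similarity can be scored from path lengths in a lexical database such as WordNet via NLTK; this is a simplified sketch, not the paper's exact measure, which also factors in hierarchy depth and corpus statistics.

```python
# Edge-counting word similarity over WordNet with NLTK (requires nltk.download('wordnet')).
from nltk.corpus import wordnet as wn

def word_similarity(w1, w2):
    """Best path-based similarity over all synset pairs, in [0, 1]."""
    best = 0.0
    for s1 in wn.synsets(w1):
        for s2 in wn.synsets(w2):
            sim = s1.path_similarity(s2)  # None when no path exists (e.g. across POS)
            if sim is not None and sim > best:
                best = sim
    return best

print(word_similarity('car', 'automobile'))  # 1.0: same synset
print(word_similarity('car', 'banana'))      # much lower: distant in the hierarchy
```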
On the Effect of Dropping Layers of Pre-trained Transformer Models
Transformer-based NLP models are trained using hundreds of millions or even billions of parameters, limiting their applicability in computationally constrained environments.
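One common way to realize layer dropping in practice is to keep only the bottom k encoder layers of a pretrained model. The sketch below assumes a BERT-style encoder from Hugging Face `transformers` and shows top-layer dropping only, which is just one of the strategies the paper studies.

```python
# Keep only the bottom 6 of 12 encoder layers of a pretrained BERT-style model.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained('bert-base-uncased')
k = 6
model.encoder.layer = torch.nn.ModuleList(model.encoder.layer[:k])  # drop top layers
model.config.num_hidden_layers = k

print(sum(p.numel() for p in model.parameters()))  # noticeably fewer parameters
```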
Generating Sentences by Editing Prototypes
We propose a new generative model of sentences that first samples a prototype sentence from the training corpus and then edits it into a new sentence.
Sentence Ordering and Coherence Modeling using Recurrent Neural Networks
Modeling the structure of coherent texts is a key NLP problem.
Macro Grammars and Holistic Triggering for Efficient Semantic Parsing
To learn a semantic parser from denotations, a learning algorithm must search over a combinatorially large space of logical forms for ones consistent with the annotated denotations.
Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations
We present a framework for building unsupervised representations of entities and their compositions, where each entity is viewed as a probability distribution rather than a vector embedding.
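To make the distributional view concrete, the sketch below compares two entities represented as distributions over context embeddings using the POT optimal-transport library; this is a generic Wasserstein comparison under assumed toy data, not the paper's full Context Mover's Distance or barycenter construction.

```python
# Optimal-transport distance between two entities viewed as distributions over
# context vectors, computed with the POT library (`pip install pot`).
import numpy as np
import ot  # Python Optimal Transport

rng = np.random.default_rng(0)
contexts_a = rng.normal(size=(5, 20))            # entity A: 5 context embeddings
contexts_b = rng.normal(loc=0.5, size=(7, 20))   # entity B: 7 context embeddings
weights_a = np.full(5, 1 / 5)                    # uniform probability mass
weights_b = np.full(7, 1 / 7)

cost = ot.dist(contexts_a, contexts_b, metric='sqeuclidean')  # ground cost matrix
distance = ot.emd2(weights_a, weights_b, cost)                # exact transport cost
print(distance)
```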
NeuralWarp: Time-Series Similarity with Warping Networks
Research on time-series similarity measures has emphasized the need for elastic methods which align the indices of pairs of time series, and a plethora of non-parametric measures have been proposed for the task.
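For readers unfamiliar with elastic alignment, here is a minimal dynamic-time-warping sketch, the classic non-parametric baseline; the paper itself replaces this hand-crafted alignment with learned warping networks.

```python
# Classic dynamic time warping (DTW): elastic alignment of two series of
# possibly different lengths under an absolute-difference local cost.
import numpy as np

def dtw_distance(x, y):
    """Accumulated DTW cost between two 1-D series."""
    n, m = len(x), len(y)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

print(dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 2, 3]))  # small despite unequal lengths
```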
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
In this paper, we challenge the common assumption that domain-specific pretraining should start from a general-domain language model, showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.
Contrastive Learning of Sentence Embeddings from Scratch
Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings.
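As background, this is a hedged sketch of the standard in-batch contrastive (InfoNCE) objective used to train sentence embeddings; it is the generic formulation, not the paper's specific procedure for synthesizing training pairs.

```python
# In-batch contrastive (InfoNCE) loss over sentence embeddings.
import torch
import torch.nn.functional as F

def info_nce_loss(anchors, positives, temperature=0.05):
    """anchors, positives: (batch, dim) embeddings; i-th rows form a positive pair."""
    anchors = F.normalize(anchors, dim=-1)
    positives = F.normalize(positives, dim=-1)
    logits = anchors @ positives.T / temperature   # scaled cosine similarities
    labels = torch.arange(anchors.size(0))         # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

a = torch.randn(8, 256)
p = a + 0.01 * torch.randn(8, 256)  # near-duplicates as toy positive pairs
print(info_nce_loss(a, p))
```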