Paraphrase Identification
72 papers with code • 10 benchmarks • 17 datasets
The goal of Paraphrase Identification is to determine whether two sentences have the same meaning.
Source: Adversarial Examples with Difficult Common Words for Paraphrase Identification
Image source: On Paraphrase Identification Corpora
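The task defined above takes a sentence pair as input and produces a binary label. The following is a minimal sketch of that framing, assuming a simple word-overlap (Jaccard) baseline with a hand-picked threshold. It is not the method of any paper listed on this page, and the function names and the 0.5 threshold are hypothetical choices made for illustration only.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard_similarity(sentence_a, sentence_b):
    """Return |A ∩ B| / |A ∪ B| over the word sets of the two sentences."""
    tokens_a, tokens_b = tokenize(sentence_a), tokenize(sentence_b)
    if not tokens_a and not tokens_b:
        # Two empty sentences are treated as identical by convention.
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_paraphrase(sentence_a, sentence_b, threshold=0.5):
    """Label a pair as a paraphrase when word overlap reaches the threshold.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return jaccard_similarity(sentence_a, sentence_b) >= threshold


if __name__ == "__main__":
    pairs = [
        ("How do I learn Python quickly?", "How can I learn Python quickly?"),
        ("The cat sat on the mat.", "Stock prices fell sharply today."),
    ]
    for a, b in pairs:
        print(f"{jaccard_similarity(a, b):.2f}", is_paraphrase(a, b))
```

Word overlap is only a weak proxy for meaning: two sentences can share most of their words and still differ in meaning, or share few words and mean the same thing, which is one reason learned sentence-pair models are typically used for this task.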
Libraries
Use these libraries to find Paraphrase Identification models and implementations
Latest papers
Factorising Meaning and Form for Intent-Preserving Paraphrasing
We propose a method for generating paraphrases of English questions that retain the original intent but use a different surface form.
FNet: Mixing Tokens with Fourier Transforms
At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).
Entailment as Few-Shot Learner
Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.
TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning
Learning sentence embeddings often requires a large amount of labeled data.
Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks
Most existing methods generate post-hoc explanations for neural network models by identifying individual feature attributions or detecting interactions between adjacent features.
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Although pretrained language models can be fine-tuned to produce state-of-the-art results for a very wide range of language understanding tasks, the dynamics of this process are not well understood, especially in the low data regime.
RealFormer: Transformer Likes Residual Attention
Transformer is the backbone of modern NLP models.
Self-Explaining Structures Improve NLP Models
The proposed model comes with the following merits: (1) span weights make the model self-explainable and do not require an additional probing model for interpretation; (2) the proposed model is general and can be adapted to any existing deep learning structures in NLP; (3) the weight associated with each text span provides direct importance scores for higher-level text units such as phrases and sentences.
Adversarial Semantic Collisions
We study semantic collisions: texts that are semantically unrelated but judged as similar by NLP models.
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge
We present a new benchmark dataset called PARADE for paraphrase identification that requires specialized domain knowledge.