Paraphrase Identification

72 papers with code • 10 benchmarks • 17 datasets

The goal of Paraphrase Identification is to determine whether a pair of sentences have the same meaning.

Source: Adversarial Examples with Difficult Common Words for Paraphrase Identification

Image source: On Paraphrase Identification Corpora
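
As a rough illustration of the task's input and output, the sketch below labels a sentence pair as a paraphrase when the Jaccard overlap of their word sets reaches a threshold. This is a naive lexical baseline chosen for illustration, not a method from any paper listed on this page; the function names and the 0.5 threshold are assumptions. Pure lexical overlap is the kind of shortcut that adversarial paraphrase datasets are built to defeat.

```python
import re


def tokenize(sentence: str) -> set:
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two sentences."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_paraphrase(a: str, b: str, threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: overlap at or above the threshold (assumed 0.5)."""
    return jaccard(a, b) >= threshold


if __name__ == "__main__":
    print(is_paraphrase("The cat sat on the mat.", "On the mat the cat sat."))
    print(is_paraphrase("How do I learn Python?", "What is the capital of France?"))
```

A baseline like this ignores word order and meaning, so it accepts pairs that share words but differ in sense and rejects true paraphrases that use different vocabulary.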

GAPX: Generalized Autoregressive Paraphrase-Identification X

yifeizhou02/generalized_paraphrase_identification 5 Oct 2022

Paraphrase Identification is a fundamental task in Natural Language Processing.

3
05 Oct 2022

Adversarial Self-Attention for Language Understanding

gingasan/adversarialsa 25 Jun 2022

Deep neural models (e.g., Transformer) naturally learn spurious features, which create a "shortcut" between the labels and inputs, thus impairing the generalization and robustness.

5
25 Jun 2022

NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures

zurichnlp/nmtscore 28 Apr 2022

Being able to rank the similarity of short text segments is an interesting bonus feature of neural machine translation.

22
28 Apr 2022

Match-Prompt: Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning

xsc1234/match-prompt 6 Apr 2022

In the generalization stage, the matching model explores the essential matching signals by being trained on diverse matching tasks.

2
06 Apr 2022

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

huggingface/transformers Preprint 2022

While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind.

125,020
07 Feb 2022

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

amzn/trans-encoder ICLR 2022

Predominantly, two formulations are used for sentence-pair tasks: bi-encoders and cross-encoders.

131
27 Sep 2021

Towards Better Characterization of Paraphrases

tlkh/paraphrase-metrics ACL ARR September 2021

To effectively characterize the nature of paraphrase pairs without expert human annotation, we propose two new metrics: word position deviation (WPD) and lexical deviation (LD).

2
17 Sep 2021

Modelling Latent Translations for Cross-Lingual Transfer

McGill-NLP/latent-translation 23 Jul 2021

To remedy this, we propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model, by treating the intermediate translations as a latent random variable.

17
23 Jul 2021

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

google-research/google-research ICLR 2022

In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.

32,808
23 Jun 2021

Improving Paraphrase Detection with the Adversarial Paraphrasing Task

Advancing-Machine-Human-Reasoning-Lab/apt ACL 2021

Can we teach them instead to identify paraphrases in a way that draws on the inferential properties of the sentences, and is not over-reliant on lexical and syntactic similarities of a sentence pair?

20
14 Jun 2021