Paraphrase Identification
72 papers with code • 11 benchmarks • 18 datasets
The goal of Paraphrase Identification is to determine whether a pair of sentences has the same meaning (see the minimal inference sketch below).
Source: Adversarial Examples with Difficult Common Words for Paraphrase Identification
Image source: On Paraphrase Identification Corpora
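A minimal inference sketch for this task, assuming a sentence-pair classifier that has already been fine-tuned on a paraphrase corpus such as MRPC or QQP. The checkpoint name is a placeholder, not something prescribed by this page.

```python
# Sketch: score a sentence pair with a fine-tuned paraphrase classifier.
# The checkpoint identifier below is an assumption; substitute any Hub model
# or local path of a sentence-pair classifier trained on MRPC/QQP-style data.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "path/or/hub-id-of-a-paraphrase-classifier"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

s1 = "How can I learn to play the guitar?"
s2 = "What is the best way to learn guitar?"

# Encode the two sentences as a single pair; the tokenizer inserts the separator itself.
inputs = tokenizer(s1, s2, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)
print(probs)  # e.g. [[P(not paraphrase), P(paraphrase)]]; label order depends on the checkpoint
```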
Libraries
Use these libraries to find Paraphrase Identification models and implementations.

Most implemented papers
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
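A sketch of how a BERT encoder is typically adapted to sentence-pair tasks such as paraphrase identification: encode "[CLS] s1 [SEP] s2 [SEP]" and classify the pooled [CLS] representation. Hyperparameters and the dropout rate here are illustrative, not the paper's exact recipe.

```python
# Sketch: BERT sentence-pair classifier for paraphrase identification.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertPairClassifier(nn.Module):
    def __init__(self, name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained(name)
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, **encoded):
        pooled = self.bert(**encoded).pooler_output   # [CLS]-based pooled vector
        return self.classifier(self.dropout(pooled))  # logits over {not paraphrase, paraphrase}

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertPairClassifier()
batch = tokenizer(["A man is playing a guitar."], ["A person plays guitar."],
                  return_tensors="pt", padding=True, truncation=True)
logits = model(**batch)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1]))  # fine-tune with standard cross-entropy
```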
XLNet: Generalized Autoregressive Pretraining for Language Understanding
With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining such as BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
FNet: Mixing Tokens with Fourier Transforms
At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).
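A sketch of FNet's token-mixing sublayer: self-attention is replaced by a parameter-free 2D discrete Fourier transform over the sequence and hidden dimensions, keeping only the real part. The residual/LayerNorm placement and layer sizes below follow the usual Transformer encoder layout and are illustrative.

```python
# Sketch: FNet-style encoder layer with Fourier token mixing instead of attention.
import torch
import torch.nn as nn

class FourierMixing(nn.Module):
    def forward(self, x):            # x: (batch, seq_len, hidden)
        return torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real

class FNetEncoderLayer(nn.Module):
    def __init__(self, hidden=256, ff=1024):
        super().__init__()
        self.mixing = FourierMixing()                 # parameter-free, no attention weights
        self.norm1 = nn.LayerNorm(hidden)
        self.ffn = nn.Sequential(nn.Linear(hidden, ff), nn.GELU(), nn.Linear(ff, hidden))
        self.norm2 = nn.LayerNorm(hidden)

    def forward(self, x):
        x = self.norm1(x + self.mixing(x))
        return self.norm2(x + self.ffn(x))

x = torch.randn(2, 128, 256)
print(FNetEncoderLayer()(x).shape)  # torch.Size([2, 128, 256])
```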
Bilateral Multi-Perspective Matching for Natural Language Sentences
Natural language sentence matching is a fundamental technology for a variety of tasks.
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
We propose three attention schemes that integrate mutual influence between sentences into CNNs; the representation of each sentence thus takes its counterpart into consideration.
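A simplified sketch of the core idea behind these attention schemes: compute a match-score matrix between the feature maps of the two sentences so that each sentence's representation can attend to its counterpart. The 1/(1 + euclidean distance) score mirrors the paper's input-level scheme; the learned projection matrices and the different places attention is applied in the three variants are omitted.

```python
# Sketch: sentence-pair attention matrix in the spirit of ABCNN.
import torch

def attention_matrix(f1, f2):
    # f1: (len1, d), f2: (len2, d) convolutional feature maps of the two sentences
    dist = torch.cdist(f1, f2)          # pairwise euclidean distances, shape (len1, len2)
    return 1.0 / (1.0 + dist)           # higher score = stronger unit-level match

f1, f2 = torch.randn(7, 50), torch.randn(9, 50)
A = attention_matrix(f1, f2)            # (7, 9)
f1_attended = A @ f2                    # sentence-1 units re-expressed via sentence 2
f2_attended = A.t() @ f1                # sentence-2 units re-expressed via sentence 1
```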
Multi-Task Deep Neural Networks for Natural Language Understanding
In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks.
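A simplified sketch of the MT-DNN idea: one shared text encoder with a lightweight head per NLU task, trained jointly. The encoder choice and task set below are illustrative, not the paper's exact configuration.

```python
# Sketch: shared encoder + task-specific heads for multi-task NLU fine-tuning.
import torch.nn as nn
from transformers import AutoModel

class MultiTaskModel(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", task_num_labels=None):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)       # shared across tasks
        hidden = self.encoder.config.hidden_size
        task_num_labels = task_num_labels or {"mrpc": 2, "qqp": 2, "mnli": 3}  # illustrative tasks
        self.heads = nn.ModuleDict({t: nn.Linear(hidden, n)          # task-specific heads
                                    for t, n in task_num_labels.items()})

    def forward(self, task, **encoded):
        pooled = self.encoder(**encoded).last_hidden_state[:, 0]     # [CLS] token
        return self.heads[task](pooled)

# Training alternates mini-batches drawn from different tasks, so the shared
# encoder learns representations that transfer across them.
```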
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind.
TinyBERT: Distilling BERT for Natural Language Understanding
To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method specially designed for knowledge distillation (KD) of Transformer-based models.
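A sketch of Transformer-layer distillation in the spirit of TinyBERT: the student mimics the teacher's hidden states (through a learned projection, since the student is narrower) and its soft predictions. Attention-map and embedding distillation, the layer mapping, and the paper's two-stage (general plus task-specific) procedure are omitted for brevity.

```python
# Sketch: hidden-state + prediction-layer distillation losses.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_hidden, teacher_hidden, student_logits, teacher_logits,
                      proj, temperature=1.0):
    # Hidden-state distillation: MSE after projecting student width -> teacher width.
    hidden_loss = F.mse_loss(proj(student_hidden), teacher_hidden)
    # Prediction-layer distillation: cross-entropy against the teacher's soft labels.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    pred_loss = -(soft_targets * F.log_softmax(student_logits / temperature, dim=-1)).sum(-1).mean()
    return hidden_loss + pred_loss

proj = nn.Linear(312, 768)                      # e.g. narrow student width -> BERT-base width
loss = distillation_loss(torch.randn(4, 16, 312), torch.randn(4, 16, 768),
                         torch.randn(4, 2), torch.randn(4, 2), proj)
```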
SpanBERT: Improving Pre-training by Representing and Predicting Spans
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
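A sketch of SpanBERT-style span masking: instead of masking individual tokens, contiguous spans are masked, with span lengths drawn from a (clipped) geometric distribution. The span-boundary objective that predicts masked tokens from the span's boundary representations is not shown.

```python
# Sketch: sample contiguous masked spans up to a token budget.
import numpy as np

def sample_span_mask(seq_len, mask_ratio=0.15, p=0.2, max_span=10):
    masked = set()
    budget = int(mask_ratio * seq_len)
    while len(masked) < budget:
        span_len = min(np.random.geometric(p), max_span)        # favours short spans
        start = np.random.randint(0, max(1, seq_len - span_len))
        masked.update(range(start, start + span_len))           # mask a contiguous span
    return sorted(masked)

print(sample_span_mask(seq_len=128))
```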
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Due to the limited data of downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the downstream data and forget the knowledge of the pre-trained model.
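A sketch of the smoothness-inducing regularizer that SMART adds to standard fine-tuning to counter this: the model's predictions should change little under small perturbations of the input embeddings. The single random perturbation below is a simplification of the paper's projected-gradient inner loop, and the Bregman proximal point optimizer is omitted.

```python
# Sketch: smoothness-inducing regularization via symmetric KL under perturbation.
import torch
import torch.nn.functional as F

def symmetric_kl(p_logits, q_logits):
    p, q = F.log_softmax(p_logits, -1), F.log_softmax(q_logits, -1)
    return (F.kl_div(q, p, log_target=True, reduction="batchmean")
            + F.kl_div(p, q, log_target=True, reduction="batchmean"))

def smoothness_penalty(model_fn, embeddings, epsilon=1e-3):
    # model_fn maps input embeddings -> classification logits.
    clean_logits = model_fn(embeddings)
    noise = epsilon * torch.randn_like(embeddings)         # simplified: random, not adversarial
    perturbed_logits = model_fn(embeddings + noise)
    return symmetric_kl(clean_logits, perturbed_logits)

# Total fine-tuning loss = task cross-entropy + lambda * smoothness_penalty(...)
```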