Paraphrase Identification

53 papers with code • 8 benchmarks • 15 datasets

The goal of Paraphrase Identification is to determine whether two sentences have the same meaning.

Source: Adversarial Examples with Difficult Common Words for Paraphrase Identification

Image source: On Paraphrase Identification Corpora
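As a concrete (if naive) illustration of the task, a lexical-overlap baseline can already separate some pairs. The Jaccard threshold below is an arbitrary assumption for illustration; the systems listed on this page instead learn the decision boundary from labeled sentence pairs:

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two sentences."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def is_paraphrase(a: str, b: str, threshold: float = 0.5) -> bool:
    # threshold=0.5 is an illustrative assumption, not a tuned value
    return jaccard(a, b) >= threshold

print(is_paraphrase("the cat sat on the mat", "the cat sat on a mat"))      # True
print(is_paraphrase("the cat sat on the mat", "stocks fell sharply today")) # False
```

Such a baseline fails on paraphrases with little word overlap, which is precisely what the neural models below are designed to handle.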



Most implemented papers

XLNet: Generalized Autoregressive Pretraining for Language Understanding

zihangdai/xlnet NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining such as BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

FNet: Mixing Tokens with Fourier Transforms

labmlai/annotated_deep_learning_paper_implementations 9 May 2021

At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).
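FNet's core idea, replacing self-attention with an unparameterized 2-D Fourier transform over the sequence and hidden dimensions and keeping only the real part, can be sketched in a few lines (a naive DFT is used here for clarity; the paper uses FFTs):

```python
import cmath

def dft(xs):
    """Naive 1-D discrete Fourier transform of a list of numbers."""
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(xs))
            for k in range(n)]

def fnet_mixing(tokens):
    """FNet token-mixing sublayer: 2-D DFT over (sequence, hidden), real part only."""
    # DFT along the hidden dimension of each token...
    mixed = [dft(row) for row in tokens]
    # ...then along the sequence dimension of each hidden channel...
    cols = list(zip(*mixed))
    mixed = list(zip(*[dft(list(col)) for col in cols]))
    # ...and discard the imaginary component, as FNet does.
    return [[v.real for v in row] for row in mixed]

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 tokens, hidden size 2
print(fnet_mixing(x))
```

Because the transform has no learned parameters, all of the model's capacity lives in the feed-forward sublayers, which is where FNet's speed advantage comes from.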

Bilateral Multi-Perspective Matching for Natural Language Sentences

zhiguowang/BiMPM 13 Feb 2017

Natural language sentence matching is a fundamental technology for a variety of tasks.
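BiMPM's central operation is multi-perspective matching: each perspective k re-weights the dimensions of two contextual vectors with a learned weight vector before taking their cosine. A minimal sketch, with fixed weights standing in for the learned ones:

```python
import math

def cosine(u, v):
    """Cosine similarity, with a guard for zero-norm vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def multi_perspective_match(v1, v2, W):
    """One cosine per perspective; each row of W re-weights the dimensions
    of both vectors before the comparison (weights here are illustrative)."""
    return [cosine([w * a for w, a in zip(wk, v1)],
                   [w * b for w, b in zip(wk, v2)])
            for wk in W]

W = [[1.0, 1.0, 1.0], [2.0, 0.5, 0.1]]  # two hypothetical perspectives
print(multi_perspective_match([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], W))
```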

ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs

yinwenpeng/Answer_Selection TACL 2016

We propose three attention schemes that integrate mutual influence between sentences into the CNN, so that the representation of each sentence takes its counterpart into consideration.
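ABCNN's attention matrices compare every unit of one sentence's feature map with every unit of the other's, using 1/(1 + Euclidean distance) as the match score. A minimal sketch:

```python
import math

def match_score(x, y):
    """ABCNN match-score: 1 / (1 + Euclidean distance) between feature columns."""
    return 1.0 / (1.0 + math.dist(x, y))

def attention_matrix(s0, s1):
    """Attention matrix A, where A[i][j] compares position i of one sentence
    with position j of the other (each position is a feature vector)."""
    return [[match_score(x, y) for y in s1] for x in s0]

# Two toy feature maps, one feature per position:
print(attention_matrix([[0.0], [3.0]], [[0.0], [4.0]]))
```

In the full model this matrix is used to re-weight the convolution inputs (ABCNN-1) and/or to pool the convolution outputs (ABCNN-2).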

Multi-Task Deep Neural Networks for Natural Language Understanding

namisan/mt-dnn ACL 2019

In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks.

SpanBERT: Improving Pre-training by Representing and Predicting Spans

facebookresearch/SpanBERT TACL 2020

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
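SpanBERT replaces BERT's single-token masking with span masking: span lengths are drawn from a geometric distribution (the paper uses p = 0.2, clipped at 10 tokens) and contiguous spans are masked until a budget is reached. A seeded sketch, with span starts chosen uniformly at random (a simplification of the paper's word-boundary handling):

```python
import random

def geometric(p, max_len, rng):
    """Draw a span length from Geo(p), clipped at max_len."""
    length = 1
    while rng.random() > p and length < max_len:
        length += 1
    return length

def sample_masked_spans(seq_len, mask_budget, p=0.2, max_len=10, seed=0):
    """Mask contiguous spans until at least `mask_budget` positions are covered."""
    rng = random.Random(seed)
    masked = set()
    while len(masked) < mask_budget:
        length = geometric(p, max_len, rng)
        start = rng.randrange(0, seq_len - length + 1)
        masked.update(range(start, start + length))
    return sorted(masked)

print(sample_masked_spans(seq_len=50, mask_budget=8))
```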

TinyBERT: Distilling BERT for Natural Language Understanding

huawei-noah/Pretrained-Language-Model Findings of EMNLP 2020

To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of Transformer-based models.
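One building block of such knowledge distillation is a soft-target loss: a temperature-scaled cross-entropy between the teacher's and student's output distributions. A minimal sketch of that term (the temperature value here is an arbitrary illustration, not TinyBERT's setting):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def soft_ce(teacher_logits, student_logits, T=2.0):
    """Soft cross-entropy: the student is trained to match the teacher's
    softened distribution rather than hard one-hot labels."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher = [2.0, 0.0, -1.0]
print(soft_ce(teacher, [0.0, 0.0, 0.0]))  # higher: student disagrees
print(soft_ce(teacher, teacher))          # lower: student matches teacher
```

By Gibbs' inequality the loss is minimized exactly when the student's distribution matches the teacher's; TinyBERT additionally distills intermediate attention maps and hidden states, which this sketch omits.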

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

facebookresearch/SentEval ICLR 2018

In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model.