Text Augmentation

25 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

jasonwei20/eda_nlp IJCNLP 2019

We present EDA: easy data augmentation techniques for boosting performance on text classification tasks.
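For orientation, the EDA paper describes four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The following is a minimal sketch of the two that need no thesaurus, random swap and random deletion. The function names, parameters, and defaults are illustrative assumptions and are not taken from the jasonwei20/eda_nlp repository.

```python
import random
from typing import List, Optional


def random_swap(tokens: List[str], n_swaps: int = 1,
                rng: Optional[random.Random] = None) -> List[str]:
    """Swap two randomly chosen positions, n_swaps times (hypothetical helper)."""
    rng = rng if rng is not None else random.Random()
    out = list(tokens)
    if len(out) < 2:
        return out  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens: List[str], p: float = 0.1,
                    rng: Optional[random.Random] = None) -> List[str]:
    """Drop each token independently with probability p (hypothetical helper)."""
    rng = rng if rng is not None else random.Random()
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    if not kept:
        # Assumption: never return an empty sentence; keep one random token.
        kept = [rng.choice(tokens)]
    return kept


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    print(random_swap(sentence, n_swaps=2))
    print(random_deletion(sentence, p=0.2))
```

Both helpers return a new list and leave the input untouched, so one source sentence can be augmented several times.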

Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations

pfnet-research/contextual_augmentation NAACL 2018

We stochastically replace words with other words that are predicted by a bi-directional language model at the word positions.
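The replacement step described above can be sketched as follows. This is a simplified illustration, not the pfnet-research/contextual_augmentation code: the `predict` callback stands in for the bi-directional language model, and `contextual_augment`, `toy_predictor`, and `replace_prob` are hypothetical names introduced here. The toy predictor looks up the current token in a hand-written table, whereas a real language model would condition on the surrounding context.

```python
import random
from typing import Callable, Dict, List, Optional

# Hypothetical predictor type: given the tokens and a position, return
# candidate words with their probabilities for that position.
Predictor = Callable[[List[str], int], Dict[str, float]]


def contextual_augment(tokens: List[str], predict: Predictor,
                       replace_prob: float = 0.15,
                       rng: Optional[random.Random] = None) -> List[str]:
    """Stochastically replace tokens with words sampled from predict()."""
    rng = rng if rng is not None else random.Random()
    out = list(tokens)
    for i in range(len(tokens)):
        if rng.random() >= replace_prob:
            continue  # leave this position unchanged
        candidates = predict(tokens, i)
        if not candidates:
            continue  # predictor has no suggestion here
        words = list(candidates)
        weights = [candidates[w] for w in words]
        # Sample one replacement in proportion to its predicted probability.
        out[i] = rng.choices(words, weights=weights, k=1)[0]
    return out


def toy_predictor(tokens: List[str], i: int) -> Dict[str, float]:
    """Stand-in for a language model: a fixed table keyed by the current token."""
    table = {
        "good": {"great": 0.5, "fine": 0.3, "good": 0.2},
        "movie": {"film": 0.6, "movie": 0.4},
    }
    return table.get(tokens[i], {})


if __name__ == "__main__":
    sentence = ["the", "movie", "was", "good"]
    print(contextual_augment(sentence, toy_predictor, replace_prob=0.5))
```

Sampling from the predicted distribution, rather than always taking the most likely word, is what makes the replacement stochastic and yields varied outputs from the same sentence.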

Learning to Compose Domain-Specific Transformations for Data Augmentation

HazyResearch/tanda NeurIPS 2017

Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels.

Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding

AtmaHou/Seq2SeqDataAugmentationForLU COLING 2018

In this paper, we study the problem of data augmentation for language understanding in task-oriented dialogue systems.

Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs

ClaudeCoulombe/TextDataAugmentation 5 Dec 2018

In practice, it is common to find oneself with far too little text data to train a deep neural network.

Improving short text classification through global augmentation methods

dsfsi/textaugment 7 Jul 2019

We study the effect of different approaches to text augmentation.

Empirical Study of Text Augmentation on Social Media Text in Vietnamese

sonlam1102/text_augmentation_vietnamese 25 Sep 2020

When user comments are collected from social networks, the data is usually skewed toward one label, which makes the dataset imbalanced and degrades the model's performance.

Text Augmentation for Language Models in High Error Recognition Scenario

BUTSpeechFIT/BrnoLM 11 Nov 2020

We examine the effect of data augmentation for training of language models for speech recognition.

Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning

thunlp/MixADA 31 Dec 2020

In this work, we propose a simple and effective method to cover a much larger proportion of the attack search space, called Adversarial and Mixup Data Augmentation (AMDA).