Because the decoder architecture is the same as an autoregressive LM, it is simple to enhance the model by leveraging external text data with LM training.

Probabilistic Linguistic Knowledge and Token-level Text Augmentation

no code yet • 29 Jun 2023

This paper investigates the effectiveness of token-level text augmentation and the role of probabilistic linguistic knowledge within a linguistically-motivated evaluation context.

Text Generation with Speech Synthesis for ASR Data Augmentation

no code yet • 22 May 2023

In this work, we explore text augmentation for ASR using large-scale pre-trained neural networks, and systematically compare those to traditional text augmentation methods.

Boosting Event Extraction with Denoised Structure-to-Text Augmentation

no code yet • 16 May 2023

Event extraction aims to recognize pre-defined event triggers and arguments from texts, which suffer from the lack of high-quality annotations.

Shuffle & Divide: Contrastive Learning for Long Text

no code yet • 19 Apr 2023

We propose a self-supervised learning method for long text documents based on contrastive learning.

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

no code yet • 15 Dec 2022

Experiments on Librispeech and in-house data show relative WER reductions (WERRs) of 3% to 5%, with a slight increase in model size and negligible extra token emission latency compared with the fast-slow encoder based transducer.

Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values

no code yet • 14 Oct 2022

Therefore, we introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command.

Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding

no code yet • 6 Sep 2022

One of the main challenges is to collect a sufficient amount of annotated data to train a model.

Data Augmentation for Low-Resource Quechua ASR Improvement

no code yet • 14 Jul 2022

In this paper we describe our data augmentation approach to improve the results of ASR models for low-resource and agglutinative languages.

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

no code yet • 7 Jan 2022

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language.

Text Augmentation