End-To-End Speech Recognition

56 papers with code • 0 benchmarks • 3 datasets

Greatest papers with code

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

tensorflow/models 8 Dec 2015

We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages.

Accented Speech Recognition End-To-End Speech Recognition +1

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

mozilla/DeepSpeech 18 Apr 2019

On LibriSpeech, we achieve 6.8% WER on test-other without the use of a language model, and 5.8% WER with shallow fusion with a language model.

Data Augmentation End-To-End Speech Recognition +2
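
The WER (word error rate) figures quoted in the SpecAugment entry above are conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that standard definition; it is not code from any of the listed repositories, the function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: whitespace tokenization only
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference. Published figures also usually apply text normalization before scoring, which this sketch omits.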

TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation

kaldi-asr/kaldi 12 May 2018

We present the recent development on Automatic Speech Recognition (ASR) systems in comparison with the two previous releases of the TED-LIUM Corpus from 2012 and 2014.

End-To-End Speech Recognition Speech Recognition

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

facebookresearch/wav2letter 19 Nov 2019

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #11 on Speech Recognition on LibriSpeech test-other (using extra training data)

End-To-End Speech Recognition Language Modelling +1

Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models

espnet/espnet 20 Jul 2021

Non-autoregressive (NAR) modeling has gained more and more attention in speech processing.

End-To-End Speech Recognition Speech Recognition

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

espnet/espnet 5 Apr 2021

In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models.

End-To-End Speech Recognition Speech Recognition

Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription

espnet/espnet 22 Apr 2020

To demonstrate this, we use the CHiME-6 Challenge data as an example of challenging environments and noisy conditions of everyday speech.

Data Augmentation End-To-End Speech Recognition +2

Jasper: An End-to-End Convolutional Neural Acoustic Model

osmr/imgclsmob 5 Apr 2019

In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech recognition models without any external training data.

End-To-End Speech Recognition Language Modelling +1