To the best of the authors' knowledge, the results obtained when training on the full LibriSpeech training set are the best currently published, both for hybrid DNN/HMM and for attention-based systems.
Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community.
On LibriSpeech, we achieve 6.8% WER on test-other without the use of a language model, and 5.8% WER using shallow fusion with a language model.
#3 best model for Speech Recognition on LibriSpeech test-clean (using extra training data)
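Shallow fusion, as used for the 5.8% result above, simply interpolates the end-to-end model's per-step scores with those of an external language model at decoding time. A minimal sketch, assuming per-step log-probabilities from both models; the function name and the `lm_weight` value are illustrative, with the weight tuned on held-out data in practice:

```python
import numpy as np

def shallow_fusion_step(am_log_probs, lm_log_probs, lm_weight=0.3):
    """One decoding step with shallow fusion.

    am_log_probs: log p(token | audio, prefix) from the speech model, shape (V,)
    lm_log_probs: log p(token | prefix) from the external LM, shape (V,)
    lm_weight:    interpolation weight (illustrative value)
    """
    fused = am_log_probs + lm_weight * lm_log_probs
    return int(np.argmax(fused))  # greedy pick; real decoders fuse inside beam search
```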
In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech recognition models without any external training data.
#5 best model for Speech Recognition on LibriSpeech test-other
The success of self-attention in NLP has led to recent applications in end-to-end encoder-decoder architectures for speech recognition.
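For orientation, the core operation these encoder-decoder architectures borrow from NLP is scaled dot-product self-attention over the frame sequence. A minimal single-head sketch; the projection matrices `Wq`, `Wk`, `Wv` are illustrative stand-ins for learned parameters:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    x: (T, d) sequence of encoder frames; Wq, Wk, Wv: (d, d_k) projections.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (T, T) frame-to-frame scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
    return weights @ v                               # context vectors, (T, d_k)
```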
We present Optimal Completion Distillation (OCD), a training procedure for optimizing sequence-to-sequence models based on edit distance.
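The key subroutine in OCD is finding, for a sampled prefix, the set of next tokens that keep the best achievable edit distance to the reference from increasing. A sketch of that dynamic program under our reading of the paper; the function name and interface are ours, not the authors' code:

```python
def optimal_completion_targets(prefix, reference):
    """Optimal next tokens under edit distance (OCD's target set)."""
    # dp[j] = edit distance between `prefix` and `reference[:j]`
    dp = list(range(len(reference) + 1))
    for i, p in enumerate(prefix, start=1):
        prev, dp[0] = dp[0], i
        for j, r in enumerate(reference, start=1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,          # delete the prefix token
                dp[j - 1] + 1,      # insert the reference token
                prev + (p != r),    # substitute, or match at no cost
            )
    best = min(dp)
    # reference[j] is optimal wherever reference[:j] aligns at minimal cost;
    # if dp[-1] == best, emitting end-of-sequence is optimal as well
    return {reference[j] for j in range(len(reference)) if dp[j] == best}
```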
We present our work on end-to-end training of acoustic models using the lattice-free maximum mutual information (LF-MMI) objective function in the context of hidden Markov models.
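Schematically, the LF-MMI criterion contrasts the likelihood of the acoustics under a numerator graph built from each utterance's transcript against a shared phone-level denominator graph approximating all word sequences; the notation below is ours, not the paper's:

```latex
\mathcal{F}_{\mathrm{LF\text{-}MMI}}
  = \sum_{u} \log
    \frac{p\!\left(\mathbf{X}_u \mid \mathbb{G}^{\mathrm{num}}_u\right)}
         {p\!\left(\mathbf{X}_u \mid \mathbb{G}^{\mathrm{den}}\right)}
```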
In this paper, we study end-to-end systems trained directly from the raw waveform, building on two alternatives for trainable replacements of mel-filterbanks that use a convolutional architecture.
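A trainable convolutional front end of this kind can be sketched as a bank of learned 1-D filters over the raw waveform with squaring and log compression standing in for the mel triangle filters. The hyperparameters below (40 filters, 25 ms window, 10 ms hop at an assumed 16 kHz sample rate) are illustrative, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class LearnableFilterbank(nn.Module):
    """A minimal learnable stand-in for mel-filterbanks."""

    def __init__(self, n_filters=40, kernel_size=400, stride=160):
        super().__init__()
        # 25 ms filters with a 10 ms hop at 16 kHz (assumed sample rate)
        self.conv = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)

    def forward(self, waveform):           # waveform: (batch, samples)
        x = waveform.unsqueeze(1)          # (batch, 1, samples)
        energy = self.conv(x) ** 2         # rectification via squaring
        return torch.log1p(energy)         # log compression, as in log-mel features
```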