Robust Speech Recognition
20 papers with code • 0 benchmarks • 3 datasets
We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.
On the Aurora 4 task, the very deep CNN achieves a WER of 8.81%, improving further to 7.99% with auxiliary feature joint training and to 7.09% with LSTM-RNN joint decoding.
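For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the scoring pipeline used in the paper. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain whitespace
    splitting, which real scoring tools usually refine with normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.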
Deep generative models have achieved great success in unsupervised learning with the ability to capture complex nonlinear relationships between latent generating factors and observations.
Speech enhancement (SE) aims to suppress the additive noise from a noisy speech signal to improve the speech's perceptual quality and intelligibility.
Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition
First, we study the effectiveness of different dereverberation networks (the generator in the GAN) and find that LSTM leads to a significant improvement over feed-forward DNN and CNN on our dataset.
Unsupervised Speech Domain Adaptation Based on Disentangled Representation Learning for Robust Speech Recognition
The latent variables allow us to convert the domain of speech according to its context and domain representation.
We investigate the potential of stochastic neural networks for learning effective waveform-based acoustic models.
We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks.
Then, for each class, probabilities of this class are used to compute a mean vector, which we refer to as mean soft labels.
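The snippet above leaves the exact procedure open, so the following is only one plausible reading: for each class, average the predicted probability vectors of the samples labeled with that class. The function name `mean_soft_labels`, its arguments, and the handling of classes with no samples are assumptions made for this sketch, not details taken from the paper.

```python
from typing import List, Sequence


def mean_soft_labels(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    num_classes: int,
) -> List[List[float]]:
    """Average the probability vectors of all samples sharing a label.

    Hypothetical helper: probs[i] is the model's output distribution for
    sample i, labels[i] is its integer class. Row c of the result is the
    mean vector for class c (all zeros if class c has no samples).
    """
    dim = len(probs[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes

    for vector, label in zip(probs, labels):
        counts[label] += 1
        for j, value in enumerate(vector):
            sums[label][j] += value

    return [
        [total / counts[c] for total in sums[c]] if counts[c] else sums[c]
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    example_probs = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
    example_labels = [0, 0, 1]
    for row in mean_soft_labels(example_probs, example_labels, num_classes=2):
        print(row)
```

Under this reading, each row is itself a probability distribution whenever the inputs are, since it is an average of distributions.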