10 papers with code • 0 benchmarks • 0 datasets
To join the advantages of classical and end-to-end approaches for speech recognition, we present a simple, novel and competitive approach for phoneme-based neural transducer modeling.
For the speech recognition rescoring, although the proposed LSTM LM obtains very slight gains, the new model seems obtain the great complementary with the conventional LSTM LM.
Recurrent Neural Network Language Models (RNNLMs) have started to be used in various fields of speech recognition due to their outstanding performance.
Therefore a Tigrinya large vocabulary continuous speech recognition system often has a large number of different units and a high out-of-vocabulary (OOV) rate if a word is used as a recognition unit of a language model (LM) and lexicon.
We propose a method to reduce false voice triggers of a speech-enabled personal assistant by post-processing the hypothesis lattice of a server-side large-vocabulary continuous speech recognizer (LVCSR) via a neural network.
Self-attention networks (SAN) have been introduced into automatic speech recognition (ASR) and achieved state-of-the-art performance owing to its superior ability in capturing long term dependency.
LSTM based language models are an important part of modern LVCSR systems as they significantly improve performance over traditional backoff language models.