Librispeech Transducer Model with Internal Language Model Prior Correction

7 Apr 2021  ·  Albert Zeyer, André Merboldt, Wilfried Michel, Ralf Schlüter, Hermann Ney

We present our transducer model on Librispeech. We study variants of including an external language model (LM) with shallow fusion and subtracting an estimated internal LM. This is justified by a Bayesian interpretation in which the transducer model prior is given by the estimated internal LM. Subtracting the internal LM gives us over 14% relative improvement over normal shallow fusion. Our transducer has a separate probability distribution for the non-blank labels, which allows for easier combination with the external LM and easier estimation of the internal LM. We additionally take care to include the end-of-sentence (EOS) probability of the external LM in the last blank probability, which further improves the performance. All our code and setups are published.
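
The combination described in the abstract is shallow fusion with an internal-LM prior correction: in log space, the external LM score is added with a scale, the estimated internal LM score is subtracted, and at sequence end the external LM's EOS probability is folded into the final blank score. The sketch below illustrates the per-step score combination only; the function names and the scales lam_ext / lam_ilm are illustrative placeholders, not the tuned settings from the paper.

    def label_score(log_p_label, log_p_ext_lm, log_p_ilm,
                    lam_ext=0.5, lam_ilm=0.4):
        """Combined log score for a non-blank label hypothesis.

        log_p_label:  log prob of the label under the transducer's separate
                      non-blank distribution (given encoder state and history).
        log_p_ext_lm: log prob of the label under the external LM.
        log_p_ilm:    log prob of the label under the estimated internal LM
                      (the transducer prior in the Bayesian interpretation).
        lam_ext, lam_ilm: fusion scales; example values only.
        """
        return log_p_label + lam_ext * log_p_ext_lm - lam_ilm * log_p_ilm

    def final_blank_score(log_p_blank, log_p_ext_lm_eos, lam_ext=0.5):
        """Score of the last blank, folding in the external LM's EOS probability."""
        return log_p_blank + lam_ext * log_p_ext_lm_eos

In an actual decoder these scores would be accumulated per beam hypothesis across the search; only the per-step combination is sketched here.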


Datasets

LibriSpeech

Results from the Paper


Ranked #25 on Speech Recognition on LibriSpeech test-clean (using extra training data)

Task               | Dataset                | Model           | Metric Name            | Metric Value | Global Rank | Uses Extra Training Data
Speech Recognition | LibriSpeech test-clean | LSTM Transducer | Word Error Rate (WER)  | 2.23         | #25         | Yes
Speech Recognition | LibriSpeech test-other | LSTM Transducer | Word Error Rate (WER)  | 5.6          | #28         | Yes

Methods


No methods listed for this paper.