Neural Network Language Modeling with Letter-based Features and Importance Sampling

ICASSP 2018 · Hainan Xu, Ke Li, Yiming Wang, Jian Wang, Shiyin Kang, Xie Chen, Daniel Povey, Sanjeev Khudanpur

In this paper we describe an extension of the Kaldi software toolkit to support neural-based language modeling, intended for use in automatic speech recognition (ASR) and related tasks. We combine the use of subword features (letter n-grams) and one-hot encoding of frequent words so that the models can handle large vocabularies containing infrequent words...
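The combination of letter n-gram features with one-hot features for frequent words can be illustrated with a small sketch. This is not the Kaldi implementation and not necessarily the paper's exact feature set; the function names, the `^`/`$` boundary markers, the n-gram orders, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter


def letter_ngrams(word, n_min=2, n_max=3):
    """Return letter n-grams of a word, with assumed boundary markers ^ and $."""
    padded = f"^{word}$"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def build_feature_map(word_counts, min_count=2, n_min=2, n_max=3):
    """Assign an index to each feature.

    Words seen at least `min_count` times (a hypothetical threshold) get a
    dedicated one-hot feature; every word also contributes its letter n-grams.
    """
    features = {}
    for word, count in sorted(word_counts.items()):
        if count >= min_count:
            features.setdefault(("word", word), len(features))
        for gram in letter_ngrams(word, n_min, n_max):
            features.setdefault(("ngram", gram), len(features))
    return features


def word_features(word, features, n_min=2, n_max=3):
    """Return a sparse {feature_index: count} representation of a word."""
    sparse = Counter()
    idx = features.get(("word", word))
    if idx is not None:
        sparse[idx] += 1  # one-hot component, present only for frequent words
    for gram in letter_ngrams(word, n_min, n_max):
        idx = features.get(("ngram", gram))
        if idx is not None:
            sparse[idx] += 1  # subword component, shared across words
    return dict(sparse)


counts = {"cat": 5, "cats": 1}
features = build_feature_map(counts)
frequent = word_features("cat", features)   # one-hot feature plus n-grams
rare = word_features("cats", features)      # n-grams only
unseen = word_features("cab", features)     # only n-grams shared with known words
```

In this sketch an infrequent or unseen word still receives a non-empty representation through the n-grams it shares with other words, which is the property the abstract points to for handling large vocabularies.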


Evaluation results from the paper

Task               | Dataset                | Model                          | Metric name           | Metric value | Global rank
Speech Recognition | LibriSpeech test-other | tdnn + chain + rnnlm rescoring | Word Error Rate (WER) | 7.63         | #3