Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline

27 Mar 2018
Szu-Jui Chen, Aswin Shanmugam Subramanian, Hainan Xu, Shinji Watanabe

This paper describes a new baseline system for automatic speech recognition (ASR) in the CHiME-4 challenge. It aims to promote the development of noisy ASR in speech processing communities by providing 1) a state-of-the-art yet simplified single system comparable to the complicated top systems in the challenge, and 2) a publicly available and reproducible recipe through the main repository of the Kaldi speech recognition toolkit. The proposed system adopts generalized eigenvalue beamforming with bidirectional long short-term memory (LSTM) mask estimation...


Evaluation results from the paper

Task | Dataset | Model | Metric name | Metric value | Global rank
Distant Speech Recognition | CHiME-4 real 6ch | HMM-TDNN(LFMMI) + LSTMLM + NN-GEV | Word Error Rate (WER) | 2.74 | #1
Noisy Speech Recognition | CHiME real | HMM-TDNN(LFMMI) + LSTMLM | Percentage error | 11.4 | #1
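
For readers unfamiliar with the WER metric reported above, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words and expressed as a percentage. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the scoring script used by the challenge or by the Kaldi recipe.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent between two whitespace-tokenized transcripts.

    Illustrative helper (hypothetical name), not the official CHiME-4 scorer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors but do not increase the number of reference words, a value computed this way can exceed 100 when the hypothesis is much longer than the reference.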