The PyTorch-Kaldi Speech Recognition Toolkit

19 Nov 2018 - Mirco Ravanelli, Titouan Parcollet, Yoshua Bengio

The availability of open-source software is playing a remarkable role in the popularization of speech recognition and deep learning. Kaldi, for instance, is nowadays an established framework used to develop state-of-the-art speech recognizers...

Evaluation Results from the Paper


| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Noisy Speech Recognition | CHiME real | Li-GRU | Percentage error | 14.6 | #2 |
| Distant Speech Recognition | DIRHA English WSJ | Li-GRU | Word Error Rate (WER) | 23.9 | #1 |
| Speech Recognition | TIMIT | GRU | Percentage error | 16.6 | #8 |
| Speech Recognition | TIMIT | LiGRU + Dropout + BatchNorm + Monophone Reg | Percentage error | 14.2 | #1 |
| Speech Recognition | TIMIT | LSTM + Dropout + BatchNorm + Monophone Reg | Percentage error | 14.5 | #2 |
| Speech Recognition | TIMIT | GRU + Dropout + BatchNorm + Monophone Reg | Percentage error | 14.9 | #3 |
| Speech Recognition | TIMIT | Li-GRU | Percentage error | 16.3 | #6 |
| Speech Recognition | TIMIT | LSTM | Percentage error | 16.0 | #5 |
| Speech Recognition | TIMIT | RNN | Percentage error | 16.5 | #7 |
| Speech Recognition | TIMIT | RNN + Dropout + BatchNorm + Monophone Reg | Percentage error | 15.9 | #4 |