Task	Dataset	Model	Metric Name	Metric Value (%)	Global Rank
Noisy Speech Recognition	CHiME real	Li-GRU	Percentage error	14.6	# 3
Distant Speech Recognition	DIRHA English WSJ	Li-GRU	Word Error Rate (WER)	23.9	# 1
Speech Recognition	LibriSpeech test-clean	Li-GRU	Word Error Rate (WER)	6.2	# 51
Speech Recognition	TIMIT	Li-GRU	Percentage error	16.3	# 10
Speech Recognition	TIMIT	GRU + Dropout + BatchNorm + Monophone Reg	Percentage error	14.9	# 6
Speech Recognition	TIMIT	LSTM + Dropout + BatchNorm + Monophone Reg	Percentage error	14.5	# 4
Speech Recognition	TIMIT	LSTM	Percentage error	16.0	# 9
Speech Recognition	TIMIT	RNN	Percentage error	16.5	# 11
Speech Recognition	TIMIT	GRU	Percentage error	16.6	# 13
Speech Recognition	TIMIT	RNN + Dropout + BatchNorm + Monophone Reg	Percentage error	15.9	# 8
Speech Recognition	TIMIT	Li-GRU + Dropout + BatchNorm + Monophone Reg	Percentage error	14.2	# 3
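
The Word Error Rate values in the table are conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the recognizer output, divided by the number of reference words. The following is a minimal sketch of that definition, assuming whitespace tokenization and a single utterance; `word_error_rate` is a hypothetical helper written for illustration and is not part of PyTorch-Kaldi or of Kaldi's scoring scripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: 100 * (S + D + I) / N, using word-level edit distance.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

The same edit-distance computation can be applied to phone sequences by passing space-separated phone labels instead of words. Reported benchmark figures are normally accumulated over a whole test set (total errors divided by total reference tokens) rather than averaged per utterance.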


The PyTorch-Kaldi Speech Recognition Toolkit

19 Nov 2018 · Mirco Ravanelli, Titouan Parcollet, Yoshua Bengio

The availability of open-source software is playing a remarkable role in the popularization of speech recognition and deep learning. Kaldi, for instance, is now an established framework for developing state-of-the-art speech recognizers. PyTorch is used to build neural networks in Python and has recently spawned tremendous interest within the machine learning community thanks to its simplicity and flexibility. The PyTorch-Kaldi project aims to bridge the gap between these popular toolkits, trying to inherit the efficiency of Kaldi and the flexibility of PyTorch. PyTorch-Kaldi is not merely a simple interface between these toolkits; it also embeds several useful features for developing modern speech recognizers. For instance, the code is specifically designed so that user-defined acoustic models can be plugged in naturally. Alternatively, users can exploit several pre-implemented neural networks that can be customized using intuitive configuration files. PyTorch-Kaldi supports multiple feature and label streams as well as combinations of neural networks, enabling the use of complex neural architectures. The toolkit is publicly released along with rich documentation and is designed to work properly locally or on HPC clusters. Experiments conducted on several datasets and tasks show that PyTorch-Kaldi can effectively be used to develop modern state-of-the-art speech recognizers.


Code


mravanelli/pytorch-kaldi official

2,353

wponghiran/imp-snns-for-sl

NOEPG/pytorch-kaldi

xpz123/pytorch-kaldi

Baileyswu/pytorch-hmm-vae

See all 11 implementations

Tasks


Distant Speech Recognition

Noisy Speech Recognition

Speech Recognition

Datasets

LibriSpeech

TIMIT

DIRHA

Results from the Paper


Ranked #1 on Distant Speech Recognition on DIRHA English WSJ


