Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces

19 May 2020  ·  Frank Zhang, Yongqiang Wang, Xiaohui Zhang, Chunxi Liu, Yatharth Saraf, Geoffrey Zweig

In this work, we first show that on the widely used LibriSpeech benchmark, our transformer-based context-dependent connectionist temporal classification (CTC) system produces state-of-the-art results. We then show that by using wordpieces as modeling units combined with CTC training, we can greatly simplify the engineering pipeline compared to conventional frame-based cross-entropy training, excluding all GMM bootstrapping, decision-tree building, and forced-alignment steps while still achieving very competitive word error rates. Additionally, using wordpieces as modeling units can significantly improve runtime efficiency, since we can use a larger stride without losing accuracy. We further confirm these findings on two internal VideoASR datasets: German, which, like English, is a fusional language, and Turkish, which is an agglutinative language.
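CTC training over wordpiece units avoids the forced-alignment step because the per-frame labels are collapsed at decode time: repeated symbols are merged and blank symbols are dropped. The sketch below illustrates that collapse rule in plain Python; the wordpiece IDs, the blank index, and the frame sequence are hypothetical examples, not values from the paper.

```python
# Greedy CTC decoding sketch: merge consecutive repeats, then drop blanks.
# BLANK = 0 follows the common convention; the wordpiece inventory below
# (1 = "_hel", 2 = "lo") is a made-up example for illustration.

BLANK = 0

def ctc_collapse(frame_ids):
    """Collapse a per-frame argmax sequence into output units."""
    out = []
    prev = None
    for token in frame_ids:
        # Emit a unit only when it differs from the previous frame
        # and is not the blank symbol.
        if token != prev and token != BLANK:
            out.append(token)
        prev = token
    return out

# Nine frames of hypothetical network output at some stride:
frames = [0, 1, 1, 0, 0, 2, 2, 2, 0]
print(ctc_collapse(frames))  # [1, 2]  -> "_hel" + "lo" = "hello"
```

Because the collapse rule is invariant to how many frames each unit spans, a coarser stride (fewer frames per second) yields the same output sequence as long as each emitted unit still covers at least one frame, which is why larger strides trade little accuracy for faster inference with wordpiece targets.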


Datasets

LibriSpeech; internal VideoASR (German, Turkish)
Results from the Paper


Ranked #17 on Speech Recognition on LibriSpeech test-other (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank | Uses Extra Training Data |
|---|---|---|---|---|---|---|
| Speech Recognition | LibriSpeech test-clean | CTC + Transformer LM rescoring | Word Error Rate (WER) | 2.10 | #21 | Yes |
| Speech Recognition | LibriSpeech test-other | CTC + Transformer LM rescoring | Word Error Rate (WER) | 4.20 | #17 | Yes |

Methods


No methods listed for this paper.