Attention based on-device streaming speech recognition with large speech corpus

2 Jan 2020 · Kwangyoun Kim, Kyungmin Lee, Dhananjaya Gowda, Junmo Park, Sungsoo Kim, Sichen Jin, Young-Yoon Lee, Jinsu Yeo, Daehyun Kim, Seokyeong Jung, Jungin Lee, Myoungji Han, Chanwoo Kim

In this paper, we present a new on-device automatic speech recognition (ASR) system based on monotonic chunk-wise attention (MoChA) models trained on a large (> 10K hours) corpus. We attained a word recognition rate of around 90% for the general domain, mainly by using joint training with connectionist temporal classification (CTC) and cross-entropy (CE) losses, minimum word error rate (MWER) training, layer-wise pre-training, and data augmentation methods...
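The MWER training named in the abstract is commonly formulated as the expected number of word errors over an N-best list, with hypothesis probabilities renormalised over that list and the mean error subtracted as a baseline. The following is a minimal sketch under that assumption; it is not taken from the paper, and the function names `word_errors` and `mwer_loss`, as well as the example hypotheses and probabilities, are illustrative.

```python
import math


def word_errors(hyp, ref):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(
                prev[j] + 1,              # extra word in the hypothesis
                cur[j - 1] + 1,           # reference word missing from the hypothesis
                prev[j - 1] + (h != r),   # substitution (cost 1) or match (cost 0)
            ))
        prev = cur
    return prev[-1]


def mwer_loss(nbest, ref):
    """Expected word errors over an N-best list, relative to the mean error.

    nbest: list of (hypothesis string, log-probability) pairs (assumed input format).
    ref:   reference transcript as a string.
    """
    ref_words = ref.split()
    errors = [word_errors(h.split(), ref_words) for h, _ in nbest]

    # Renormalise the hypothesis probabilities over the N-best list only.
    max_lp = max(lp for _, lp in nbest)
    weights = [math.exp(lp - max_lp) for _, lp in nbest]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Subtracting the mean error is a common variance-reduction baseline.
    mean_err = sum(errors) / len(errors)
    return sum(p * (e - mean_err) for p, e in zip(probs, errors))


# Hypothetical 3-best list for the reference "turn on the light".
nbest = [
    ("turn on the light", math.log(0.6)),
    ("turn on light", math.log(0.3)),
    ("turn of the light", math.log(0.1)),
]
loss = mwer_loss(nbest, "turn on the light")
```

In a real training setup the log-probabilities would come from the model's beam search and the loss would be differentiated with respect to them; here they are fixed numbers so the arithmetic is easy to follow.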
