no code implementations • 29 Mar 2022 • Jingyu Sun, Guiping Zhong, Dinghao Zhou, Baoxiang Li
To improve the performance of the streaming model and reduce its computational complexity, this paper employs a frame-level model with an efficient augmented memory transformer block and a dynamic latency training method for streaming automatic speech recognition.
Automatic Speech Recognition (ASR)
no code implementations • 2 Apr 2021 • Lu Huang, Jingyu Sun, Yufeng Tang, JunFeng Hou, Jinkun Chen, Jun Zhang, Zejun Ma
This work describes an encoder pre-training procedure that uses frame-wise labels to improve the training of streaming recurrent neural network transducer (RNN-T) models.