no code implementations • 16 Mar 2023 • Yanzhe Fu, Yueteng Kang, Songjun Cao, Long Ma
In this work, we propose a two-stage knowledge distillation method to address these two problems: the first stage distills the large, non-streaming teacher model into a smaller model, and the second stage converts that model to streaming operation.
Automatic Speech Recognition (ASR) +2
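The two-stage recipe above builds on standard knowledge distillation, where a student is trained to match the teacher's temperature-softened output distribution. A minimal sketch (function names and the temperature value are illustrative, not taken from the paper):

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL(teacher || student) on temperature-softened outputs,
    # scaled by T^2 as in standard knowledge distillation.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two distributions diverge.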
no code implementations • 9 Mar 2022 • Yike Zhang, Xiaobing Feng, Yi Liu, Songjun Cao, Long Ma
Automatic speech recognition (ASR) systems used on smartphones or in vehicles are usually required to process speech queries from very different domains.
Automatic Speech Recognition (ASR) +4
1 code implementation • 22 Feb 2022 • Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang
Recently, end-to-end automatic speech recognition models based on connectionist temporal classification (CTC) have achieved impressive results, especially when fine-tuned from wav2vec2.0 models.
Automatic Speech Recognition (ASR) +3
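CTC models like the ones above emit one label distribution per acoustic frame; the simplest way to turn those frame-level labels into a transcript is best-path decoding, which collapses repeated labels and drops blanks. A minimal sketch (the blank id of 0 is an assumption):

```python
def ctc_greedy_decode(frame_ids, blank=0):
    # Best-path CTC decoding: collapse consecutive repeats,
    # then remove the blank symbol.
    out = []
    prev = None
    for t in frame_ids:
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out
```

For example, the frame sequence `[0, 1, 1, 0, 2, 2, 0]` decodes to `[1, 2]`.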
no code implementations • 14 Dec 2021 • Keqi Deng, Songjun Cao, Yike Zhang, Long Ma
In our framework, the encoder is initialized with a pretrained AM (wav2vec2.0).
Automatic Speech Recognition (ASR) +2
no code implementations • 15 Sep 2021 • Keqi Deng, Songjun Cao, Long Ma
For the former task, a standard deviation constraint loss (SDC-loss) based end-to-end (E2E) architecture is proposed to identify accents within the same language.
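The abstract does not spell out the SDC-loss formulation. Purely as an illustration, a standard-deviation constraint can be written as a penalty that pulls each embedding dimension's standard deviation toward a target value; both the formulation and `target_std` below are hypothetical, not the paper's definition:

```python
import numpy as np

def sdc_penalty(embeddings, target_std=1.0):
    # Hypothetical standard-deviation constraint: penalize each
    # embedding dimension whose std deviates from a target value.
    # (Illustrative only; the paper's exact SDC-loss may differ.)
    std = embeddings.std(axis=0)
    return float(((std - target_std) ** 2).mean())
```

Such a term would typically be added to the main classification loss with a weighting coefficient.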
no code implementations • 15 Sep 2021 • Songjun Cao, Yueteng Kang, Yanzhe Fu, Xiaoshuo Xu, Sining Sun, Yike Zhang, Long Ma
Under such a framework, the neural network is usually pre-trained with massive unlabeled data and then fine-tuned with limited labeled data.
Automatic Speech Recognition (ASR) +2
no code implementations • 7 Jul 2021 • Songjun Cao, Yike Zhang, Xiaobing Feng, Long Ma
Second, a group of geo-specific language models (Geo-LMs) is integrated into our speech recognition system to improve the recognition accuracy of long-tail and homophone POIs.
no code implementations • 1 May 2020 • Baiji Liu, Songjun Cao, Sining Sun, Weibin Zhang, Long Ma
Experiments on AISHELL-1 data show that the proposed model, together with the training strategies, reduces the character error rate (CER) of MoChA from 8.96% to 7.68% on the test set.
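The CER figures quoted above are edit-distance based: the Levenshtein distance between reference and hypothesis character sequences, divided by the reference length. A self-contained sketch (the helper name is ours):

```python
def cer(ref, hyp):
    # Character error rate: Levenshtein edit distance / reference length,
    # computed with a single rolling DP row.
    m, n = len(ref), len(hyp)
    d = list(range(n + 1))          # distances for the previous ref prefix
    for i in range(1, m + 1):
        prev, d[0] = d[0], i        # prev holds d[i-1][j-1]
        for j in range(1, n + 1):
            cur = d[j]              # save d[i-1][j] before overwriting
            d[j] = min(d[j] + 1,                            # deletion
                       d[j - 1] + 1,                        # insertion
                       prev + (ref[i - 1] != hyp[j - 1]))   # sub / match
            prev = cur
    return d[n] / max(m, 1)
```

For example, one substitution in a three-character reference gives a CER of 1/3.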