no code implementations • 26 Jul 2023 • Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li
RNN-T models, which are widely used in ASR, rely on the RNN-T loss to align the lengths of the input audio and the target sequence.
no code implementations • 24 May 2023 • Zhi-Hao Lai, Tian-Hao Zhang, Qi Liu, Xinyuan Qian, Li-Fang Wei, Song-Lu Chen, Feng Chen, Xu-Cheng Yin
To address these issues, this paper proposes InterFormer, which interactively fuses local and global features to learn a better representation for ASR.
Automatic Speech Recognition (ASR)
no code implementations • 23 May 2023 • Tian-Hao Zhang, Hai-Bo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin
Experimental results show that ASCD significantly improves performance by jointly leveraging acoustic and semantic information.
no code implementations • 14 Sep 2021 • Chuan-Fei Zhang, Yan Liu, Tian-Hao Zhang, Song-Lu Chen, Feng Chen, Xu-Cheng Yin
To tackle the above problems, we propose a new non-autoregressive transformer with a unified bidirectional decoder (NAT-UBD), which can simultaneously utilize left-to-right and right-to-left contexts.
Automatic Speech Recognition (ASR)