Search Results for author: Baoxiang Li

Found 3 papers, 0 papers with code

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition

no code implementations • 26 Jul 2023 • Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li

RNN-T models, which rely on the RNN-T loss to achieve length alignment between the input audio and the target sequence, are widely used in ASR.

Automatic Speech Recognition

A Polyphone BERT for Polyphone Disambiguation in Mandarin Chinese

no code implementations • 1 Jul 2022 • Song Zhang, Ken Zheng, Xiaoxu Zhu, Baoxiang Li

Grapheme-to-phoneme (G2P) conversion is an indispensable part of a Mandarin Chinese text-to-speech (TTS) system. The core of G2P conversion is polyphone disambiguation: picking the correct pronunciation from several candidates for a Chinese polyphonic character.

Polyphone disambiguation

Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer

no code implementations • 29 Mar 2022 • Jingyu Sun, Guiping Zhong, Dinghao Zhou, Baoxiang Li

To improve the performance of the streaming model and reduce computational complexity, this paper employs a frame-level model that uses an efficient augmented memory transformer block and a dynamic latency training method for streaming automatic speech recognition.

Automatic Speech Recognition (ASR)
