2 code implementations • 7 Aug 2023 • Xian Shi, Yexin Yang, Zerui Li, Yanni Chen, Zhifu Gao, Shiliang Zhang
It combines the accuracy of AED-based models, the efficiency of NAR models, and an explicit customization capacity with superior performance.
1 code implementation • 19 May 2023 • Keyu An, Xian Shi, Shiliang Zhang
Recently, the recurrent neural network transducer (RNN-T) has gained increasing popularity due to its natural streaming capability and superior performance.
Ranked #9 on Speech Recognition on AISHELL-1
Automatic Speech Recognition (ASR)
1 code implementation • 18 May 2023 • Zhifu Gao, Zerui Li, JiaMing Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang
FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications.
Ranked #1 on Speech Recognition on WenetSpeech (using extra training data)
no code implementations • 18 May 2023 • Xian Shi, Haoneng Luo, Zhifu Gao, Shiliang Zhang, Zhijie Yan
Estimating confidence scores for recognition results is a classic task in the ASR field and is vitally important for many downstream tasks and training strategies.
1 code implementation • 29 Jan 2023 • Xian Shi, Yanni Chen, Shiliang Zhang, Zhijie Yan
Conventional ASR systems use frame-level phoneme posteriors to perform forced alignment (FA) and provide timestamps, whereas end-to-end ASR systems, especially AED-based ones, lack this ability.
no code implementations • 2 May 2022 • Xian Shi, Xun Xu, Wanyue Zhang, Xiatian Zhu, Chuan Sheng Foo, Kui Jia
We also demonstrate the feasibility of a more efficient training strategy.
no code implementations • 18 Jan 2021 • Xian Shi, Xun Xu, Ke Chen, Lile Cai, Chuan Sheng Foo, Kui Jia
Deep learning models are the state-of-the-art methods for semantic point cloud segmentation, the success of which relies on the availability of large-scale annotated datasets.
no code implementations • 17 Nov 2020 • Xiong Wang, Zhuoyuan Yao, Xian Shi, Lei Xie
End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance.
Automatic Speech Recognition (ASR)
1 code implementation • 10 Sep 2020 • Jiehong Lin, Xian Shi, Yuan Gao, Ke Chen, Kui Jia
A point set is arguably the most direct approximation of an object or scene surface, yet in practice acquired point sets are often noisy, sparse, and possibly incomplete, which restricts their use for high-quality surface recovery.
no code implementations • 12 Jul 2020 • Xian Shi, Qiangze Feng, Lei Xie
The paper then presents an overview of the results and system performance in the three tracks.