Search Results for author: Xian Shi

Found 10 papers, 5 papers with code

SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability

2 code implementations • 7 Aug 2023 • Xian Shi, Yexin Yang, Zerui Li, Yanni Chen, Zhifu Gao, Shiliang Zhang

It combines the accuracy of AED-based models, the efficiency of NAR models, and an explicit customization capacity with superior performance.

BAT: Boundary aware transducer for memory-efficient and low-latency ASR

1 code implementation • 19 May 2023 • Keyu An, Xian Shi, Shiliang Zhang

Recently, the recurrent neural network transducer (RNN-T) has gained increasing popularity due to its natural streaming capability as well as its superior performance.

Ranked #9 on Speech Recognition on AISHELL-1

Automatic Speech Recognition (ASR)

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

1 code implementation • 18 May 2023 • Zhifu Gao, Zerui Li, JiaMing Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang

FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications.

Ranked #1 on Speech Recognition on WenetSpeech (using extra training data)

Action Detection • Activity Detection +2

Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System

no code implementations • 18 May 2023 • Xian Shi, Haoneng Luo, Zhifu Gao, Shiliang Zhang, Zhijie Yan

Estimating confidence scores for recognition results is a classic task in the ASR field and is of vital importance for various downstream tasks and training strategies.

Speech Recognition

Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model

1 code implementation • 29 Jan 2023 • Xian Shi, Yanni Chen, Shiliang Zhang, Zhijie Yan

Conventional ASR systems use frame-level phoneme posteriors to conduct forced alignment (FA) and provide timestamps, while end-to-end ASR systems, especially AED-based ones, lack this ability.

Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding

no code implementations • 2 May 2022 • Xian Shi, Xun Xu, Wanyue Zhang, Xiatian Zhu, Chuan Sheng Foo, Kui Jia

We also demonstrate the feasibility of a more efficient training strategy.

3D Point Cloud Classification • Point Cloud Classification

Label-Efficient Point Cloud Semantic Segmentation: An Active Learning Approach

no code implementations • 18 Jan 2021 • Xian Shi, Xun Xu, Ke Chen, Lile Cai, Chuan Sheng Foo, Kui Jia

Deep learning models are the state-of-the-art methods for semantic point cloud segmentation, the success of which relies on the availability of large-scale annotated datasets.

Active Learning • Benchmarking +3

Cascade RNN-Transducer: Syllable Based Streaming On-device Mandarin Speech Recognition with a Syllable-to-Character Converter

no code implementations • 17 Nov 2020 • Xiong Wang, Zhuoyuan Yao, Xian Shi, Lei Xie

End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance.

Automatic Speech Recognition (ASR) +2

CAD-PU: A Curvature-Adaptive Deep Learning Solution for Point Set Upsampling

1 code implementation • 10 Sep 2020 • Jiehong Lin, Xian Shi, Yuan Gao, Ke Chen, Kui Jia

A point set is arguably the most direct approximation of an object or scene surface, yet in practice the acquired points are often noisy, sparse, and possibly incomplete, which restricts their use for high-quality surface recovery.

Point Set Upsampling

The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results

no code implementations • 12 Jul 2020 • Xian Shi, Qiangze Feng, Lei Xie

The paper then presents an overview of the results and system performance in the three tracks.

Data Augmentation • Language Identification +2
