Search Results for author: Shiyi Han

Found 7 papers, 1 paper with code

Contextualization of ASR with LLM using phonetic retrieval-based augmentation

no code implementations • 11 Sep 2024 • Zhihong Lei, Xingyu Na, MingBin Xu, Ernest Pusateri, Christophe Van Gysel, Yuanyuan Zhang, Shiyi Han, Zhen Huang

Large language models (LLMs) have shown superb capability of modeling multimodal signals including audio and text, allowing the model to generate spoken or textual response given a speech input.

Retrieval, speech-recognition, +1

Enhancing CTC-based speech recognition with diverse modeling units

no code implementations • 5 Jun 2024 • Shiyi Han, Zhihong Lei, MingBin Xu, Xingyu Na, Zhen Huang

In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures such as the transformer.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices

no code implementations • 16 Dec 2023 • MingBin Xu, Alex Jin, Sicheng Wang, Mu Su, Tim Ng, Henry Mason, Shiyi Han, Zhihong Lei, Yaqiao Deng, Zhen Huang, Mahesh Krishnamoorthy

With increasingly more powerful compute capabilities and resources in today's devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization

no code implementations • 16 Oct 2023 • Zhihong Lei, Ernest Pusateri, Shiyi Han, Leo Liu, MingBin Xu, Tim Ng, Ruchir Travadi, Youyuan Zhang, Mirko Hannemann, Man-Hung Siu, Zhen Huang

Recent advances in deep learning and automatic speech recognition have improved the accuracy of end-to-end speech recognition systems, but recognition of personal content such as contact names remains a challenge.

Automatic Speech Recognition, speech-recognition, +1

Acoustic Model Fusion for End-to-end Speech Recognition

no code implementations • 10 Oct 2023 • Zhihong Lei, MingBin Xu, Shiyi Han, Leo Liu, Zhen Huang, Tim Ng, Yuanyuan Zhang, Ernest Pusateri, Mirko Hannemann, Yaqiao Deng, Man-Hung Siu

Recent advances in deep learning and automatic speech recognition (ASR) have enabled the end-to-end (E2E) ASR system and boosted the accuracy to a new level.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +5

Neural Diffusion Model for Microscopic Cascade Prediction

1 code implementation • 21 Dec 2018 • Cheng Yang, Maosong Sun, Haoran Liu, Shiyi Han, Zhiyuan Liu, Huanbo Luan

The strong assumptions oversimplify the complex diffusion mechanism and prevent these models from better fitting real-world cascade data.

Social and Information Networks, Physics and Society
