no code implementations • 11 Sep 2024 • Zhihong Lei, Xingyu Na, MingBin Xu, Ernest Pusateri, Christophe Van Gysel, Yuanyuan Zhang, Shiyi Han, Zhen Huang
Large language models (LLMs) have shown a superb capability for modeling multimodal signals, including audio and text, allowing the model to generate a spoken or textual response given a speech input.
no code implementations • 5 Jun 2024 • Shiyi Han, Zhihong Lei, MingBin Xu, Xingyu Na, Zhen Huang
In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures such as the Transformer.
Automatic Speech Recognition (ASR) +1
no code implementations • 16 Dec 2023 • MingBin Xu, Alex Jin, Sicheng Wang, Mu Su, Tim Ng, Henry Mason, Shiyi Han, Zhihong Lei, Yaqiao Deng, Zhen Huang, Mahesh Krishnamoorthy
With increasingly more powerful compute capabilities and resources in today's devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy.
Automatic Speech Recognition (ASR) +2
no code implementations • 16 Oct 2023 • Zhihong Lei, Ernest Pusateri, Shiyi Han, Leo Liu, MingBin Xu, Tim Ng, Ruchir Travadi, Youyuan Zhang, Mirko Hannemann, Man-Hung Siu, Zhen Huang
Recent advances in deep learning and automatic speech recognition have improved the accuracy of end-to-end speech recognition systems, but recognition of personal content such as contact names remains a challenge.
no code implementations • 10 Oct 2023 • Zhihong Lei, MingBin Xu, Shiyi Han, Leo Liu, Zhen Huang, Tim Ng, Yuanyuan Zhang, Ernest Pusateri, Mirko Hannemann, Yaqiao Deng, Man-Hung Siu
Recent advances in deep learning and automatic speech recognition (ASR) have enabled end-to-end (E2E) ASR systems and boosted accuracy to a new level.
Automatic Speech Recognition (ASR) +5
no code implementations • 18 Jul 2022 • MingBin Xu, Congzheng Song, Ye Tian, Neha Agrawal, Filip Granqvist, Rogier Van Dalen, Xiao Zhang, Arturo Argueta, Shiyi Han, Yaqiao Deng, Leo Liu, Anmol Walia, Alex Jin
Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP.
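The paper does not include code, but the stated combination of federated learning (FL) and differential privacy (DP) is commonly realized as DP-FedAvg: each device's model update is L2-clipped to bound sensitivity, the clipped updates are averaged, and calibrated Gaussian noise is added on the server. The sketch below is an illustrative NumPy toy, not the authors' implementation; the function names and the flat-vector update representation are assumptions for clarity.

```python
import numpy as np

def clip_update(update, clip_norm):
    # Clip a client's model update to L2 norm <= clip_norm,
    # bounding each client's contribution (DP sensitivity).
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        return update * (clip_norm / norm)
    return update

def dp_federated_average(client_updates, clip_norm=1.0,
                         noise_multiplier=1.0, rng=None):
    # DP-FedAvg-style aggregation (illustrative, not the paper's code):
    # average the clipped per-client updates, then add Gaussian noise
    # scaled by the clipping bound and the number of clients.
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    avg = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(client_updates)
    return avg + rng.normal(0.0, sigma, size=avg.shape)

# Example: two opposing client updates cancel after clipping and averaging.
updates = [np.ones(4), -np.ones(4)]
noiseless = dp_federated_average(updates, clip_norm=1.0, noise_multiplier=0.0)
```

In a real on-device NNLM setting the update vector would be the flattened model delta after local training, and the noise multiplier would be chosen to meet a target (epsilon, delta) privacy budget via a privacy accountant.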
1 code implementation • 21 Dec 2018 • Cheng Yang, Maosong Sun, Haoran Liu, Shiyi Han, Zhiyuan Liu, Huanbo Luan
The strong assumptions oversimplify the complex diffusion mechanism and prevent these models from better fitting real-world cascade data.
Social and Information Networks • Physics and Society