no code implementations • 17 Sep 2024 • Yufeng Yang, Desh Raj, Ju Lin, Niko Moritz, Junteng Jia, Gil Keren, Egor Lakomkin, Yiteng Huang, Jacob Donley, Jay Mahadeokar, Ozlem Kalinli
For the conversational ASR task in particular, using only 8 hours of labeled speech, our model outperforms a supervised ASR baseline that is trained on 2000 hours of labeled data, which demonstrates the effectiveness of our approach.
1 code implementation • 22 Aug 2024 • Max J. L. Lee, Ju Lin, Li-Ta Hsu
We propose a feasibility study for real-time automated data standardization leveraging Large Language Models (LLMs) to enhance seamless positioning systems in IoT environments.
no code implementations • 18 Jan 2024 • Ju Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide
Wearable devices like smart glasses are approaching the compute capability to seamlessly generate real-time closed captions for live conversations.
Automatic Speech Recognition (ASR)
no code implementations • 22 Jul 2023 • Suyoun Kim, Akshat Shrivastava, Duc Le, Ju Lin, Ozlem Kalinli, Michael L. Seltzer
End-to-end (E2E) spoken language understanding (SLU) systems that generate a semantic parse from speech have become more promising recently.
no code implementations • 7 Nov 2022 • Roshan Sharma, Weipeng He, Ju Lin, Egor Lakomkin, Yang Liu, Kaustubh Kalgaonkar
In this paper, we first demonstrate that egocentric visual information is helpful for noise suppression.
1 code implementation • 5 May 2021 • Jianxin Gao, Ju Lin, Irfan Kil, Ravikiran B. Singapogu, Richard E. Groff
The CRNN was implemented in real time on commodity hardware for use in the cannulation simulator, and the performance was verified.
no code implementations • 24 Feb 2021 • Ju Lin, Adriaan J. van Wijngaarden, Kuang-Ching Wang, Melissa C. Smith
The resulting multi-stage speech enhancement system, in short, multi-stage SA-TCN, is compared with state-of-the-art deep-learning speech enhancement methods using the LibriSpeech and VCTK data sets.
Automatic Speech Recognition (ASR)