no code implementations • 21 Aug 2024 • Prashant Serai, Peidong Wang, Eric Fosler-Lussier
In this work, we extend a prior phonetic-confusion-based model for predicting speech recognition errors in two ways: first, we introduce a sampling-based paradigm that better simulates the behavior of a posterior-based acoustic model.
no code implementations • 19 Jan 2024 • Prabhav Agrawal, Thilo Koehler, Zhiping Xiu, Prashant Serai, Qing He
A DSP vocoder often yields lower audio quality because it consumes over-smoothed acoustic model predictions of approximate representations of the vocal tract.
no code implementations • 1 Mar 2023 • Philipp Klumpp, Pooja Chitkara, Leda Sari, Prashant Serai, JiLong Wu, Irina-Elena Veliche, Rongqing Huang, Qing He
In this work, we improve an accent-conversion model (ACM) which transforms native US-English speech into accented pronunciation.
no code implementations • 23 Nov 2022 • Mumin Jin, Prashant Serai, JiLong Wu, Andros Tjandra, Vimal Manohar, Qing He
Most people who have tried to learn a foreign language have had difficulty understanding or speaking with a native speaker's accent.
no code implementations • 11 Apr 2022 • Vishal Sunder, Prashant Serai, Eric Fosler-Lussier
Because it is difficult to collect spoken data from users without a functioning SLU system, our method does not rely on spoken data for training; instead, we use an ASR error predictor to "speechify" the text data.
no code implementations • 23 Mar 2021 • Prashant Serai, Vishal Sunder, Eric Fosler-Lussier
Automatic Speech Recognition (ASR) is an imperfect process that results in certain mismatches in ASR output text when compared to plain written text or transcriptions.