no code implementations • 1 Dec 2021 • I-Fan Chen, Brian King, Jasha Droppo
In this paper, we propose an approach to quantitatively analyze the impact of different training label errors on RNN-T based ASR models.
no code implementations • 30 Jun 2020 • Maarten Van Segbroeck, Harish Mallidih, Brian King, I-Fan Chen, Gurpreet Chadha, Roland Maas
Acoustic models in real-time speech recognition systems typically stack multiple unidirectional LSTM layers to process the acoustic frames over time.
no code implementations • 24 Jan 2020 • Yang Chen, Weiran Wang, I-Fan Chen, Chao Wang
Practitioners often need to build ASR systems for new use cases in a short amount of time, given limited in-domain data.
no code implementations • 6 Feb 2019 • Yiming Wang, Xing Fan, I-Fan Chen, Yuzong Liu, Tongfei Chen, Björn Hoffmeister
The anchored segment refers to the wake-up word part of an audio stream, which contains valuable speaker information that can be used to suppress interfering speech and background noise.
no code implementations • 6 Mar 2015 • Zhen Huang, Sabato Marco Siniscalchi, I-Fan Chen, Jiadong Wu, Chin-Hui Lee
We present a Bayesian approach to adapting the parameters of a well-trained context-dependent deep-neural-network hidden Markov model (CD-DNN-HMM) to improve automatic speech recognition performance.