no code implementations • 13 Apr 2022 • Chen Chen, Yuchen Hu, Nana Hou, Xiaofeng Qi, Heqing Zou, Eng Siong Chng
Although automatic speech recognition (ASR) task has gained remarkable success by sequence-to-sequence models, there are two main mismatches between its training and testing that might lead to performance degradation: 1) The typically used cross-entropy criterion aims to maximize log-likelihood of the training data, while the performance is evaluated by word error rate (WER), not log-likelihood; 2) The teacher-forcing method leads to the dependence on ground truth during training, which means that model has never been exposed to its own prediction before testing.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
1 code implementation • 30 Mar 2022 • Yang Xiao, Nana Hou, Eng Siong Chng
Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment.
no code implementations • 29 Mar 2022 • Chen Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi, Eng Siong Chng
Automated Audio captioning (AAC) is a cross-modal task that generates natural language to describe the content of input audio.
no code implementations • 29 Mar 2022 • Chen Chen, Nana Hou, Yuchen Hu, Shashank Shirol, Eng Siong Chng
Noise-robust speech recognition systems require large amounts of training data including noisy speech data and corresponding transcripts to achieve state-of-the-art performances in face of various practical environments.
1 code implementation • 28 Mar 2022 • Yuchen Hu, Nana Hou, Chen Chen, Eng Siong Chng
Then, we propose style learning to map the fused feature close to clean feature, in order to learn latent speech information from the latter, i. e., clean "speech style".
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
2 code implementations • 29 Jan 2022 • Yizheng Huang, Nana Hou, Nancy F. Chen
Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment.
2 code implementations • 11 Oct 2021 • Yuchen Hu, Nana Hou, Chen Chen, Eng Siong Chng
Speech enhancement (SE) aims to suppress the additive noise from a noisy speech signal to improve the speech's perceptual quality and intelligibility.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 22 Jul 2021 • Duo Ma, Nana Hou, Van Tung Pham, HaiHua Xu, Eng Siong Chng
One of the advantage of the proposed method is that the entire system can be trained from scratch.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3