no code implementations • 28 Apr 2023 • Yuchen Liu, Natasha Ong, Kaiyan Peng, Bo Xiong, Qifan Wang, Rui Hou, Madian Khabsa, Kaiyue Yang, David Liu, Donald S. Williamson, Hanchao Yu
Our model encodes different views of the input signal and builds several channel-resolution feature stages to process the multiple views of the input at different resolutions in parallel.
no code implementations • 23 Mar 2023 • Khandokar Md. Nayem, Donald S. Williamson
In this work, we propose an attention-based enhancement approach that uses learned speech embedding vectors from a mean-opinion score (MOS) prediction model and a speech enhancement module to jointly enhance noisy speech.
no code implementations • 9 Feb 2023 • Yuying Li, Yuchen Liu, Donald S. Williamson
More specifically, we develop a joint learning approach that uses a composite T60 module and a separate dereverberation module to simultaneously perform reverberation time estimation and dereverberation.
no code implementations • 24 Dec 2020 • Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, LianWu Chen, Donald S. Williamson, Dong Yu
Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 31 Jul 2020 • Xuan Dong, Donald S. Williamson
The real-world capabilities of objective speech quality measures are limited since current measures (1) are developed from simulated data that does not adequately model real environments; or they (2) predict objective scores that are not always strongly correlated with subjective ratings.
no code implementations • 29 Jul 2020 • Zhuohuang Zhang, Chengyun Deng, Yi Shen, Donald S. Williamson, Yongtao Sha, Yi Zhang, Hui Song, Xiangang Li
Recent work has shown that it is feasible to use generative adversarial networks (GANs) for speech enhancement, however, these approaches have not been compared to state-of-the-art (SOTA) non GAN-based approaches.
Audio and Speech Processing Sound