no code implementations • 7 Nov 2024 • Subrina Sultana, Donald S. Williamson
Self-supervised learning (SSL) has attracted growing interest in the speech processing community because it produces representations that are useful for many downstream tasks.
no code implementations • 24 Oct 2024 • Seyed Ali Alavi Bajestan, Mark Pitt, Donald S. Williamson
In this paper, we propose a method based on self-supervised learning to minimize the difference between the latent representations of an attended speech signal and the corresponding EEG signal.
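A minimal sketch of this kind of latent-matching objective is shown below; the encoder architectures, dimensions, and the cosine-distance loss are illustrative assumptions rather than the paper's actual design.

```python
# Hypothetical sketch: align latent representations of attended speech and EEG.
# Module names, dimensions, and the cosine-distance objective are illustrative
# assumptions, not the paper's exact architecture or loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpeechEncoder(nn.Module):
    def __init__(self, in_dim=80, latent_dim=128):
        super().__init__()
        self.net = nn.GRU(in_dim, latent_dim, batch_first=True)

    def forward(self, x):                 # x: (batch, time, mel_bins)
        out, _ = self.net(x)
        return out.mean(dim=1)            # pooled latent: (batch, latent_dim)

class EEGEncoder(nn.Module):
    def __init__(self, channels=64, latent_dim=128):
        super().__init__()
        self.net = nn.GRU(channels, latent_dim, batch_first=True)

    def forward(self, x):                 # x: (batch, time, eeg_channels)
        out, _ = self.net(x)
        return out.mean(dim=1)

def latent_matching_loss(speech_latent, eeg_latent):
    # Minimize the difference between the two latent spaces
    # (1 - cosine similarity, averaged over the batch).
    return (1.0 - F.cosine_similarity(speech_latent, eeg_latent, dim=-1)).mean()
```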
no code implementations • 17 Oct 2024 • Anurag Kumar, Andrew Perrault, Donald S. Williamson
Objective speech quality measures are typically used to assess speech enhancement algorithms, but it has been shown that they are sub-optimal as learning objectives because they do not always align well with human subjective ratings.
no code implementations • 16 Oct 2024 • Imran E Kibria, Donald S. Williamson
However, variance in ratings across listeners can introduce noise into the true quality label of an utterance.
no code implementations • 28 Apr 2023 • Yuchen Liu, Natasha Ong, Kaiyan Peng, Bo Xiong, Qifan Wang, Rui Hou, Madian Khabsa, Kaiyue Yang, David Liu, Donald S. Williamson, Hanchao Yu
Our model encodes different views of the input signal and builds several channel-resolution feature stages to process the multiple views of the input at different resolutions in parallel.
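The sketch below illustrates the general idea of parallel channel-resolution feature stages; the number of branches, strides, and channel widths are illustrative assumptions, not the model described in the paper.

```python
# Hypothetical sketch of parallel channel-resolution stages: each branch processes
# the input at a different resolution and the branch outputs are fused.
import torch
import torch.nn as nn

class MultiResolutionStages(nn.Module):
    def __init__(self, in_channels=3, widths=(32, 64, 128), strides=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, w, kernel_size=3, stride=s, padding=1),
                nn.BatchNorm2d(w),
                nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),   # pool each branch to a single vector
            )
            for w, s in zip(widths, strides)
        )
        self.fuse = nn.Linear(sum(widths), 256)

    def forward(self, x):                  # x: (batch, channels, height, width)
        feats = [branch(x).flatten(1) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

# Usage: y = MultiResolutionStages()(torch.randn(2, 3, 224, 224))  # -> (2, 256)
```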
no code implementations • 23 Mar 2023 • Khandokar Md. Nayem, Donald S. Williamson
In this work, we propose an attention-based enhancement approach that uses learned speech embedding vectors from a mean opinion score (MOS) prediction model and a speech enhancement module to jointly enhance noisy speech.
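Below is a minimal sketch of conditioning an enhancement module on embeddings from a MOS-prediction model via cross-attention; the module layout and dimensions are assumptions for illustration only.

```python
# Hypothetical sketch: condition a speech-enhancement module on embeddings from a
# MOS-prediction model via cross-attention. Dimensions and module layout are
# illustrative assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn

class MOSConditionedEnhancer(nn.Module):
    def __init__(self, feat_dim=257, embed_dim=128, heads=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, embed_dim)
        self.cross_attn = nn.MultiheadAttention(embed_dim, heads, batch_first=True)
        self.mask_head = nn.Sequential(nn.Linear(embed_dim, feat_dim), nn.Sigmoid())

    def forward(self, noisy_spec, mos_embed):
        # noisy_spec: (batch, frames, feat_dim); mos_embed: (batch, tokens, embed_dim)
        q = self.proj(noisy_spec)
        attended, _ = self.cross_attn(q, mos_embed, mos_embed)
        mask = self.mask_head(attended)          # time-frequency mask in [0, 1]
        return mask * noisy_spec                 # enhanced magnitude spectrogram
```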
no code implementations • 9 Feb 2023 • Yuying Li, Yuchen Liu, Donald S. Williamson
More specifically, we develop a joint learning approach that uses a composite T60 module and a separate dereverberation module to simultaneously perform reverberation time estimation and dereverberation.
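A minimal sketch of such a joint, multi-task setup is given below; the shared encoder, task heads, and loss weighting are illustrative assumptions, not the paper's composite T60 module.

```python
# Hypothetical sketch of joint T60 estimation and dereverberation with a shared
# encoder and two task heads trained with a weighted multi-task loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointT60Dereverb(nn.Module):
    def __init__(self, feat_dim=257, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.t60_head = nn.Linear(hidden, 1)              # reverberation time (s)
        self.dereverb_head = nn.Linear(hidden, feat_dim)  # enhanced spectrogram

    def forward(self, reverb_spec):                       # (batch, frames, feat_dim)
        h, _ = self.encoder(reverb_spec)
        t60 = self.t60_head(h.mean(dim=1)).squeeze(-1)    # utterance-level estimate
        clean_est = self.dereverb_head(h)
        return t60, clean_est

def joint_loss(t60_pred, t60_true, clean_est, clean_true, alpha=0.5):
    # Weighted sum of the T60 regression loss and the dereverberation loss.
    return alpha * F.mse_loss(t60_pred, t60_true) + \
           (1 - alpha) * F.mse_loss(clean_est, clean_true)
```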
no code implementations • 24 Dec 2020 • Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, LianWu Chen, Donald S. Williamson, Dong Yu
Many purely neural-network-based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR)
no code implementations • 31 Jul 2020 • Xuan Dong, Donald S. Williamson
The real-world capabilities of objective speech quality measures are limited, since current measures (1) are developed from simulated data that does not adequately model real environments, or (2) predict objective scores that are not always strongly correlated with subjective ratings.
no code implementations • 29 Jul 2020 • Zhuohuang Zhang, Chengyun Deng, Yi Shen, Donald S. Williamson, Yongtao Sha, Yi Zhang, Hui Song, Xiangang Li
Recent work has shown that it is feasible to use generative adversarial networks (GANs) for speech enhancement; however, these approaches have not been compared to state-of-the-art (SOTA) non-GAN-based approaches.
Audio and Speech Processing • Sound
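For context, the sketch below shows a generic GAN-based enhancement setup of the kind such comparisons concern; the generator and discriminator architectures and the least-squares objective are illustrative assumptions, not the specific systems evaluated in the paper.

```python
# Hypothetical sketch of GAN-based speech enhancement: a generator predicts a
# time-frequency mask and a discriminator scores enhanced vs. clean spectrograms.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim = 257
generator = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                          nn.Linear(512, feat_dim), nn.Sigmoid())   # mask in [0, 1]
discriminator = nn.Sequential(nn.Linear(feat_dim, 256), nn.LeakyReLU(0.2),
                              nn.Linear(256, 1))                    # realness score

def generator_step(noisy, clean):
    enhanced = generator(noisy) * noisy
    adv = ((discriminator(enhanced) - 1.0) ** 2).mean()   # least-squares GAN loss
    recon = F.l1_loss(enhanced, clean)                    # keep output near clean
    return adv + 100.0 * recon

def discriminator_step(noisy, clean):
    enhanced = (generator(noisy) * noisy).detach()
    real = ((discriminator(clean) - 1.0) ** 2).mean()
    fake = (discriminator(enhanced) ** 2).mean()
    return real + fake
```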