no code implementations • 25 Jan 2024 • Heming Wang, Eric W. Healy, DeLiang Wang
Specifically, we employ a diffusion-based model that is conditioned on the output of a predictive model.
1 code implementation • 5 Dec 2023 • Yixuan Zhang, Heming Wang, DeLiang Wang
Accurately detecting voiced intervals in speech signals is a critical step in pitch tracking and has numerous applications.
no code implementations • 15 Nov 2023 • Hassan Taherian, DeLiang Wang
To enhance ASR performance in conversational or meeting environments, continuous speaker separation (CSS) is commonly employed.
Automatic Speech Recognition (ASR)
no code implementations • 29 Jan 2023 • Yixuan Zhang, Meng Yu, Hao Zhang, Dong Yu, DeLiang Wang
The robustness of the Kalman filter to double talk and its rapid convergence make it a popular approach for addressing acoustic echo cancellation (AEC) challenges.
no code implementations • 16 Jan 2023 • Hassan Taherian, DeLiang Wang
The performance of automatic speech recognition (ASR) systems severely degrades when multi-talker speech overlap occurs.
Automatic Speech Recognition (ASR)
no code implementations • 24 Oct 2022 • Yufeng Yang, Ashutosh Pandey, DeLiang Wang
However, speech enhancement has not been established as an effective frontend for robust automatic speech recognition (ASR) in noisy conditions compared to an ASR model trained on noisy speech directly.
Automatic Speech Recognition (ASR)
1 code implementation • 12 Apr 2022 • Haohe Liu, Xubo Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang
Speech restoration aims to remove distortions in speech signals.
1 code implementation • 28 Mar 2022 • Haohe Liu, Woosung Choi, Xubo Liu, Qiuqiang Kong, Qiao Tian, DeLiang Wang
In this paper, we propose a neural vocoder based speech super-resolution method (NVSR) that can handle a variety of input resolutions and upsampling ratios.
Ranked #2 on Audio Super-Resolution on VCTK Multi-Speaker
no code implementations • 1 Mar 2022 • Yufeng Yang, Peidong Wang, DeLiang Wang
The proposed model builds on the wide residual bi-directional long short-term memory network (WRBN) with utterance-wise dropout and iterative speaker adaptation, but employs a Conformer encoder instead of the recurrent network.
Automatic Speech Recognition (ASR)
no code implementations • 31 Oct 2021 • DeLiang Wang, Yu Lu, Qinggang Meng, Penghe Chen
As more deep learning techniques are introduced into the knowledge tracing domain, the interpretability of knowledge tracing models has attracted researchers' attention.
no code implementations • 28 Oct 2021 • Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang
The reconstruction module is used for auxiliary learning to improve the noise robustness of the learned representation and thus is not required during inference.
Automatic Speech Recognition (ASR)
no code implementations • 8 Oct 2021 • Hassan Taherian, Ke Tan, DeLiang Wang
We further demonstrate the effectiveness of LBT for the separation of four and five concurrent speakers.
no code implementations • 3 Mar 2021 • Hao Zhang, DeLiang Wang
Building on the deep learning based acoustic echo cancellation (AEC) in the single-loudspeaker (single-channel) and single-microphone setup, this paper investigates multi-channel AEC (MCAEC) and multi-microphone AEC (MMAEC).
no code implementations • 9 Nov 2020 • Peidong Wang, DeLiang Wang
On-device end-to-end speech recognition places high demands on model efficiency.
no code implementations • 20 Oct 2020 • Peidong Wang, Zhuo Chen, DeLiang Wang, Jinyu Li, Yifan Gong
We propose speaker separation using speaker inventories and estimated speech (SSUSIES), a framework leveraging speaker profiles and estimated speech for speaker separation.
2 code implementations • 4 Oct 2020 • Zhong-Qiu Wang, Peidong Wang, DeLiang Wang
Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.
no code implementations • 3 Sep 2020 • Ashutosh Pandey, DeLiang Wang
Even though the proposed loss is based on magnitudes only, a constraint imposed by noise prediction ensures that the loss enhances both magnitude and phase.
no code implementations • 13 May 2020 • Yu Lu, DeLiang Wang, Qinggang Meng, Penghe Chen
We thus propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
1 code implementation • 25 Apr 2019 • Yuzhou Liu, DeLiang Wang
Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network.
Ranked #21 on Speech Separation on WSJ0-2mix
no code implementations • 11 Mar 2019 • Peidong Wang, Ke Tan, DeLiang Wang
In this study, we analyze the distortion problem, compare different acoustic models, and investigate a distortion-independent training scheme for monaural speech recognition.
Automatic Speech Recognition (ASR)
no code implementations • 22 Nov 2018 • Zhong-Qiu Wang, Ke Tan, DeLiang Wang
This study investigates phase reconstruction for deep learning based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain.
no code implementations • 26 Apr 2018 • Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey
In addition, we train through unfolded iterations of a phase reconstruction algorithm, represented as a series of STFT and inverse STFT layers.
no code implementations • 24 Aug 2017 • DeLiang Wang, Jitong Chen
A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data.
no code implementations • 14 Dec 2016 • Peidong Wang, DeLiang Wang
This paper proposes a class of novel deep recurrent neural networks that can incorporate language-level information into acoustic models.
Automatic Speech Recognition (ASR)
no code implementations • 14 Dec 2016 • Peidong Wang, Zhongqiu Wang, DeLiang Wang
This paper presents our work on applying Recurrent Deep Stacking Networks (RDSNs) to robust automatic speech recognition (ASR) tasks.
Automatic Speech Recognition (ASR)
no code implementations • NeurIPS 2012 • Yuxuan Wang, DeLiang Wang
While human listeners excel at selectively attending to a conversation at a cocktail party, machine performance is still far inferior by comparison.