1 code implementation • 19 Jul 2022 • Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe
To showcase such integration, we performed experiments on carefully designed synthetic datasets for noisy-reverberant multi-channel speech translation (ST) and spoken language understanding (SLU) tasks, which can be used as benchmark corpora for future research.
no code implementations • 8 Mar 2022 • Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux
Existing systems for sound event localization and detection (SELD) typically operate by estimating a source location for all classes at every time instant.
no code implementations • 24 Feb 2022 • Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe
This paper describes our submission to the L3DAS22 Challenge Task 1, which consists of speech enhancement with 3D Ambisonic microphones.
1 code implementation • 10 Feb 2022 • Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao
Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs.
2 code implementations • 19 Oct 2021 • Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux
The cocktail party problem aims at isolating any source of interest within a complex acoustic scene, and has long inspired audio source separation research.
2 code implementations • 4 Oct 2020 • Zhong-Qiu Wang, Peidong Wang, DeLiang Wang
Although our system is trained on simulated room impulse responses (RIRs) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.
no code implementations • 18 Nov 2019 • Zhong-Qiu Wang, Hakan Erdogan, Scott Wisdom, Kevin Wilson, Desh Raj, Shinji Watanabe, Zhuo Chen, John R. Hershey
This work introduces sequential neural beamforming, which alternates between neural-network-based spectral separation and beamforming-based spatial separation.
no code implementations • 22 Nov 2018 • Zhong-Qiu Wang, Ke Tan, DeLiang Wang
This study investigates phase reconstruction for deep-learning-based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain.
no code implementations • 26 Apr 2018 • Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey
In addition, we train through unfolded iterations of a phase reconstruction algorithm, represented as a series of STFT and inverse STFT layers.