no code implementations • 10 Dec 2022 • Rohith Aralikatti, Zhenyu Tang, Dinesh Manocha
We present a novel approach to improve the performance of learning-based speech dereverberation using accurate synthetic datasets.
2 code implementations • 18 May 2022 • Anton Ratnarajah, Zhenyu Tang, Rohith Chandrashekar Aralikatti, Dinesh Manocha
We show that the acoustic metrics of the IRs predicted from our MESH2IR match the ground truth with less than 10% error.
2 code implementations • 7 Oct 2021 • Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu
We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Automatic Speech Recognition (ASR)
no code implementations • 19 Jul 2021 • Rohith Aralikatti, Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha
We present a novel approach that improves the performance of reverberant speech separation.
no code implementations • 25 Jun 2021 • Ori Kabeli, Yossi Adi, Zhenyu Tang, Buye Xu, Anurag Kumar
Our stateful implementation for online separation leads to a minor drop in performance compared to the offline model: 0.8 dB for monaural inputs and 0.3 dB for binaural inputs, while reaching a real-time factor of 0.65.
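As a reading aid for the figure above, the sketch below assumes the common definition of real-time factor (processing time divided by audio duration), which the listing itself does not spell out. The function name and the timing values are hypothetical and are not taken from the paper.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Ratio of compute time to audio duration (assumed definition).

    Values below 1.0 mean the system runs faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical timing: 6.5 s of compute for a 10 s recording.
rtf = real_time_factor(6.5, 10.0)
is_faster_than_real_time = rtf < 1.0
```

Under this definition, a real-time factor of 0.65 would mean each second of audio takes about 0.65 seconds to process.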
no code implementations • 21 Apr 2021 • Zhenyu Tang, Dinesh Manocha
We use a deep learning-based estimator to non-intrusively compute the sub-band reverberation time of an environment from its speech samples.
Automatic Speech Recognition (ASR)
no code implementations • 1 Jan 2021 • Hsien-Yu Meng, Zhenyu Tang, Dinesh Manocha
Acoustic properties of objects corresponding to scattering characteristics are frequently used for 3D audio content creation, environmental acoustic effects, localization and acoustic scene analysis, etc.
no code implementations • 8 May 2020 • Kelei He, Wei Zhao, Xingzhi Xie, Wen Ji, Mingxia Liu, Zhenyu Tang, Feng Shi, Yang Gao, Jun Liu, Junfeng Zhang, Dinggang Shen
Considering that only a few infection regions in a CT image are related to the severity assessment, we first represent each input image by a bag that contains a set of 2D image patches (with each cropped from a specific slice).
1 code implementation • 6 Apr 2020 • Feng Shi, Jun Wang, Jun Shi, Ziyan Wu, Qian Wang, Zhenyu Tang, Kelei He, Yinghuan Shi, Dinggang Shen
In this review paper, we thus cover the entire pipeline of medical imaging and analysis techniques involved with COVID-19, including image acquisition, segmentation, diagnosis, and follow-up.
no code implementations • 26 Mar 2020 • Zhenyu Tang, Wei Zhao, Xingzhi Xie, Zheng Zhong, Feng Shi, Jun Liu, Dinggang Shen
Purpose: To use machine learning to automatically assess the severity (non-severe or severe) of COVID-19 from chest CT images, and to explore the severity-related features identified by the resulting assessment model.
no code implementations • 14 Nov 2019 • Zhenyu Tang, Nicholas J. Bryan, Dingzeyu Li, Timothy R. Langlois, Dinesh Manocha
We present a new method to capture the acoustic characteristics of real-world rooms using commodity devices, and use the captured characteristics to generate similar sounding sources with virtual models.
Sound • Graphics • Multimedia • Audio and Speech Processing
no code implementations • 9 Jul 2019 • Zhenyu Tang, Lian-Wu Chen, Bo Wu, Dong Yu, Dinesh Manocha
We present an efficient and realistic geometric acoustic simulation approach for generating and augmenting training data in speech-related machine learning tasks.
1 code implementation • 17 Apr 2019 • Zhenyu Tang, John D. Kanu, Kevin Hogan, Dinesh Manocha
We present a novel learning-based approach to estimate the direction-of-arrival (DOA) of a sound source using a convolutional recurrent neural network (CRNN) trained via regression on synthetic data and Cartesian labels.
Ranked #1 on Direction of Arrival Estimation on SOFA (using extra training data)
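The direction-of-arrival entry above mentions regression on Cartesian labels. The sketch below illustrates one plausible reason for that choice, under assumptions: angle labels wrap around at 360 degrees, while unit vectors do not. The helper names and the azimuth/elevation convention are hypothetical and not taken from the paper.

```python
import math


def angles_to_cartesian(azimuth_deg: float, elevation_deg: float) -> tuple:
    """Map a direction given in degrees to a unit vector (x, y, z).

    Assumed convention: azimuth is measured in the x-y plane from the
    x axis, and elevation is measured upward from that plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        math.cos(el) * math.cos(az),
        math.cos(el) * math.sin(az),
        math.sin(el),
    )


def angular_error_deg(u: tuple, v: tuple) -> float:
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


# Azimuths of 359 and 1 degrees differ by 358 as raw numbers,
# but the corresponding directions are only about 2 degrees apart.
label = angles_to_cartesian(359.0, 0.0)
prediction = angles_to_cartesian(1.0, 0.0)
error = angular_error_deg(label, prediction)
```

A regression loss on the vector components stays small for nearby directions on either side of the wrap-around point, which a loss on raw angles would not.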