Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement

no code implementations • 17 Oct 2022 • Shubo Lv, Yihui Fu, Yukai Jv, Lei Xie, Weixin Zhu, Wei Rao, Yannan Wang

Recently, multi-channel speech enhancement has drawn much interest due to the use of spatial information to distinguish target speech from interfering signal.

Denoising, Speech Enhancement

S-DCCRN: Super Wide Band DCCRN with learnable complex feature for speech enhancement

no code implementations • 16 Nov 2021 • Shubo Lv, Yihui Fu, Mengtao Xing, Jiayao Sun, Lei Xie, Jun Huang, Yannan Wang, Tao Yu

In speech enhancement, complex neural networks have shown promising performance due to their effectiveness in processing complex-valued spectra.

Denoising, Speech Denoising, +1

Improving Channel Decorrelation for Multi-Channel Target Speech Extraction

no code implementations • 6 Jun 2021 • Jiangyu Han, Wei Rao, Yannan Wang, Yanhua Long

Moreover, new combination strategies of the CD-based spatial information and target speaker adaptation of parallel encoder outputs are also investigated.

Speech Extraction

A Two-Stage Approach to Device-Robust Acoustic Scene Classification

1 code implementation • 3 Nov 2020 • Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee

To improve device robustness, a highly desirable key feature of a competitive data-driven acoustic scene classification (ASC) system, a novel two-stage system based on fully convolutional neural networks (CNNs) is proposed.

Acoustic Scene Classification, Classification, +4

An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances

no code implementations • 31 Jul 2020 • Hu Hu, Sabato Marco Siniscalchi, Yannan Wang, Xue Bai, Jun Du, Chin-Hui Lee

In contrast to building scene models with whole utterances, the ASM-removed sub-utterances, i.e., acoustic utterances without stop acoustic segments, are then used as inputs to the AlexNet-L back-end for final classification.

Acoustic Scene Classification, Classification, +5

Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification

no code implementations • 31 Jul 2020 • Hu Hu, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee

In this paper, we propose a domain adaptation framework that addresses the device mismatch issue in acoustic scene classification by leveraging neural label embedding (NLE) and relational teacher student learning (RTSL).

Acoustic Scene Classification, Classification, +3

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

2 code implementations • 25 Jul 2020 • Jun Qi, Hu Hu, Yannan Wang, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Finally, our experiments on multi-channel speech enhancement with a simulated noisy WSJ0 corpus demonstrate that our proposed hybrid CNN-TT architecture outperforms both DNN and CNN models, yielding better enhanced speech quality with fewer parameters.

Regression, Speech Enhancement

Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

2 code implementations • 3 Feb 2020 • Jun Qi, Hu Hu, Yannan Wang, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Finally, in 8-channel conditions, a PESQ of 3.12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3.06.

Regression, Speech Enhancement

