Paper tables with annotated results for Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Paper

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Deep complex U-Net structure and convolutional recurrent network (CRN) structure achieve state-of-the-art performance for monaural speech enhancement. Both deep complex U-Net and CRN are encoder and decoder structures with skip connections, which heavily rely on the representation power of the complex-valued convolutional layers. In this paper, we propose a complex convolutional block attention module (CCBAM) to boost the representation power of the complex-valued convolutional layers by constructing more informative features. The CCBAM is a lightweight and general module which can be easily integrated into any complex-valued convolutional layers. We integrate CCBAM with the deep complex U-Net and CRN to enhance their performance for speech enhancement. We further propose a mixed loss function to jointly optimize the complex models in both time-frequency (TF) domain and time domain. By integrating CCBAM and the mixed loss, we form a new end-to-end (E2E) complex speech enhancement framework. Ablation experiments and objective evaluations show the superior performance of the proposed approaches.

PDF Paper record

Results in Papers With Code

(↓ scroll down to see all results)

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Reader Guidelines

Editor Guidelines