Speech Denoising
32 papers with code • 2 benchmarks • 3 datasets
Speech denoising recovers the clean speech of a target speaker by suppressing background noise.
Most implemented papers
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks.
Speech Denoising with Deep Feature Losses
We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement
In WaveCRN, the speech locality feature is captured by a convolutional neural network (CNN), while the temporal sequential property of the locality feature is modeled by stacked simple recurrent units (SRU).
Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation
In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising.
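A minimal sketch of the soft time-frequency masking idea behind this kind of approach, assuming a network has already produced non-negative magnitude estimates for the speech and noise sources. The function name `soft_mask_separate`, its parameters, and the `eps` stabilizer are illustrative assumptions, not the paper's code.

```python
import numpy as np

def soft_mask_separate(est_speech_mag, est_noise_mag, mixture_spec, eps=1e-8):
    """Split a mixture spectrogram using a soft (ratio) mask.

    est_speech_mag, est_noise_mag: non-negative magnitude estimates
        (hypothetical network outputs), shape (freq, time).
    mixture_spec: complex or real mixture spectrogram, same shape.
    """
    # Ratio mask in [0, 1]; eps guards against division by zero.
    mask = est_speech_mag / (est_speech_mag + est_noise_mag + eps)
    speech = mask * mixture_spec
    noise = (1.0 - mask) * mixture_spec
    return speech, noise
```

Because the two masks sum to one, the separated sources always add back up to the mixture. That constraint is what makes it natural to fold the mask into the network and train both together.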
Speech Denoising Convolutional Neural Network trained with Deep Feature Losses
We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.
Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses
In this paper, we propose a complex convolutional block attention module (CCBAM) to boost the representation power of the complex-valued convolutional layers by constructing more informative features.
Speech Denoising Without Clean Training Data: A Noise2Noise Approach
This paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that deep speech denoising networks can be trained using only noisy speech samples.
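The intuition for why noisy targets can suffice can be sketched numerically. Under a squared-error loss, the best estimate for a set of targets is their mean, so zero-mean noise in the targets averages out. The example below assumes synthetic Gaussian noise and a sine wave standing in for speech. The helper names are hypothetical, and it does not reproduce the paper's network or data pipeline.

```python
import numpy as np

def make_noisy_pair(clean, noise_std, rng):
    """Return two noisy views of `clean` with independent zero-mean Gaussian noise."""
    noisy_input = clean + rng.normal(0.0, noise_std, size=clean.shape)
    noisy_target = clean + rng.normal(0.0, noise_std, size=clean.shape)
    return noisy_input, noisy_target

def l2_optimal_estimate(noisy_targets):
    """The minimizer of mean squared error over a stack of targets is their mean."""
    return np.mean(noisy_targets, axis=0)

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))  # stand-in for a speech signal

# Collect many noisy targets for the same underlying signal.
targets = np.stack([make_noisy_pair(clean, 0.5, rng)[1] for _ in range(2000)])
estimate = l2_optimal_estimate(targets)

# The estimate approaches the clean signal even though no clean target was used.
max_error = np.max(np.abs(estimate - clean))
```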
FRA-RIR: Fast Random Approximation of the Image-source Method
The training of modern speech processing systems often requires a large amount of simulated room impulse response (RIR) data in order to allow the systems to generalize well in real-world, reverberant environments.
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement
Rather than focusing exclusively on the speech denoising task, we extend this work to address the dereverberation and super-resolution tasks.
Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF).
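For orientation, here is a sketch of plain NMF with multiplicative updates for the Euclidean loss. This is the standard non-Bayesian baseline, not the BNMF formulation the paper proposes. In a denoising setting `V` would typically be a magnitude spectrogram. The function name `nmf` and its parameters are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative matrix V (freq x time) as W @ H.

    W holds `rank` spectral basis vectors, H their activations over time.
    Uses multiplicative updates for the squared Euclidean loss.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In a supervised setup, one would typically learn separate bases for speech and noise, concatenate them, and reconstruct the speech part from its own bases and activations.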