Speech Enhancement

218 papers with code • 12 benchmarks • 19 datasets

Speech Enhancement is a signal processing task that involves improving the quality of speech signals captured under noisy or degraded conditions. The goal of speech enhancement is to make speech signals clearer, more intelligible, and more pleasant to listen to, benefiting applications such as voice recognition, teleconferencing, and hearing aids.

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)
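
As a rough sketch of the classical pipeline that the learning-based methods below build on or replace, here is a minimal spectral-subtraction example; the signal, FFT size, and noise-estimation heuristic are illustrative assumptions, not taken from any listed paper.

```python
# Minimal spectral-subtraction sketch: STFT -> estimate noise from the first
# frames -> subtract and floor -> inverse STFT with the noisy phase.
import numpy as np
from scipy.signal import stft, istft

def enhance(noisy, fs=16000, n_fft=512, noise_frames=10):
    _, _, spec = stft(noisy, fs=fs, nperseg=n_fft)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: the first few frames are noise-dominated; use their mean.
    noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
    # Subtract the noise estimate and floor the result to limit musical noise.
    clean_mag = np.maximum(mag - noise_mag, 0.05 * mag)
    _, enhanced = istft(clean_mag * np.exp(1j * phase), fs=fs, nperseg=n_fft)
    return enhanced

# Toy usage: a 440 Hz tone buried in white noise.
t = np.arange(16000) / 16000
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(16000)
print(enhance(noisy).shape)
```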

Libraries

Use these libraries to find Speech Enhancement models and implementations

Most implemented papers

Fast Multichannel Source Separation Based on Jointly Diagonalizable Spatial Covariance Matrices

sekiguchi92/SpeechEnhancement European Association for Signal Processing (EUSIPCO) 2019

A popular approach to multichannel source separation is to integrate a spatial model with a source model for estimating the spatial covariance matrices (SCMs) and power spectral densities (PSDs) of each sound source in the time-frequency domain.
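
As a hedged illustration of how estimated SCMs and PSDs are typically turned into separated signals, the sketch below applies a multichannel Wiener filter in NumPy; the function name, array shapes, and random inputs are assumptions for this example, not the authors' implementation.

```python
# Sketch: given per-source SCMs R[n] (F x M x M) and PSDs lam[n] (F x T),
# recover each source image with a multichannel Wiener filter.
import numpy as np

def mwf_separate(X, R, lam):
    """X: (F, T, M) mixture STFT; R: (N, F, M, M) SCMs; lam: (N, F, T) PSDs."""
    N, F, M, _ = R.shape
    # Mixture covariance at each (f, t): sum_n lam_n(f, t) * R_n(f)
    mix_cov = np.einsum('nft,nfij->ftij', lam, R)
    inv_mix = np.linalg.inv(mix_cov + 1e-6 * np.eye(M))
    images = np.empty((N,) + X.shape, dtype=complex)
    for n in range(N):
        # W_n(f, t) = lam_n(f, t) R_n(f) (sum_m lam_m(f, t) R_m(f))^-1
        W = np.einsum('ft,fij,ftjk->ftik', lam[n], R[n], inv_mix)
        images[n] = np.einsum('ftij,ftj->fti', W, X)
    return images

# Shape-only usage with random inputs (2 sources, 4 mics).
X = np.random.randn(257, 100, 4) + 1j * np.random.randn(257, 100, 4)
R = np.stack([np.tile(np.eye(4), (257, 1, 1)) for _ in range(2)])
lam = np.random.rand(2, 257, 100)
print(mwf_separate(X, R, lam).shape)  # (2, 257, 100, 4)
```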

RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement

CODEJIN/RHRNet 15 Apr 2019

Most current speech enhancement models use spectrogram features that require an expensive transformation and result in phase information loss.

Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing

bepierre/SpeechVGG 22 Oct 2019

Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer.

Improving GANs for Speech Enhancement

pquochuy/idsegan 15 Jan 2020

The first of the two proposed variants constrains the generators to learn a common mapping that is iteratively applied at all enhancement stages, resulting in a small model footprint.
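
A minimal sketch of that idea, assuming a toy PyTorch generator: the same module (and therefore the same weights) refines its own output at every stage, so adding stages does not add parameters. The architecture is a placeholder, not the paper's SEGAN-style generator.

```python
# One shared generator applied repeatedly across enhancement stages.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=31, padding=15),
            nn.PReLU(),
            nn.Conv1d(channels, 1, kernel_size=31, padding=15),
        )

    def forward(self, x):
        return self.net(x)

def iterated_enhance(noisy, generator, n_stages=3):
    # The same module (same weights) is applied at every stage.
    out = noisy
    for _ in range(n_stages):
        out = generator(out)
    return out

g = TinyGenerator()
waveform = torch.randn(1, 1, 16000)  # (batch, channel, samples)
print(iterated_enhance(waveform, g, n_stages=3).shape)  # torch.Size([1, 1, 16000])
```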

Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

uwjunqi/Tensor-Train-Neural-Network 3 Feb 2020

Finally, in 8-channel conditions, a PESQ of 3.12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3.06.

Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

microsoft/DNS-Challenge IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

This paper investigates several aspects of training an RNN (recurrent neural network) that impact the objective and subjective quality of enhanced speech for real-time single-channel speech enhancement.
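
A hedged sketch of the kind of weighted speech-distortion loss the title refers to, for a mask-based enhancer: one term penalizes attenuating the clean speech, the other penalizes residual noise, traded off by a weight alpha. The formulation and the alpha value are illustrative assumptions, not the paper's exact definition.

```python
# Weighted trade-off between speech distortion and residual noise for a
# magnitude-mask enhancer; all tensors are illustrative placeholders.
import torch

def weighted_sd_loss(mask, clean_spec, noise_spec, alpha=0.35):
    """mask, clean_spec, noise_spec: (batch, freq, time) magnitude tensors."""
    speech_distortion = ((1.0 - mask) * clean_spec).pow(2).mean()
    residual_noise = (mask * noise_spec).pow(2).mean()
    return alpha * speech_distortion + (1.0 - alpha) * residual_noise

# Example with random magnitudes standing in for STFT features.
mask = torch.rand(2, 257, 100)
clean = torch.rand(2, 257, 100)
noise = torch.rand(2, 257, 100)
print(weighted_sd_loss(mask, clean, noise).item())
```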

Deep Residual-Dense Lattice Network for Speech Enhancement

nick-nikzad/RDL-SE 27 Feb 2020

Motivated by this, we propose the residual-dense lattice network (RDL-Net), which is a new CNN for speech enhancement that employs both residual and dense aggregations without over-allocating parameters for feature re-usage.
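
A minimal sketch of combining dense aggregation (each layer sees the concatenation of earlier feature maps) with a residual skip around the block, assuming a toy PyTorch setup; channel sizes and depth are placeholders and do not follow the RDL-Net topology.

```python
# A block that mixes dense feature concatenation with a residual connection.
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    def __init__(self, channels=32, growth=16, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
                nn.ReLU(),
            ))
            in_ch += growth  # dense: each layer sees all previous outputs
        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        # Residual: fuse the dense features and add the block input back.
        return x + self.fuse(torch.cat(feats, dim=1))

block = ResidualDenseBlock()
spec = torch.randn(1, 32, 257, 100)  # (batch, channels, freq, time)
print(block(spec).shape)  # torch.Size([1, 32, 257, 100])
```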

Dual-Signal Transformation LSTM Network for Real-Time Noise Suppression

breizhn/DTLN Interspeech 2020

This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge).

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

uwjunqi/Tensor-Train-Neural-Network 25 Jul 2020

Finally, our experiments on multi-channel speech enhancement with a simulated noisy WSJ0 corpus demonstrate that the proposed hybrid CNN-TT architecture outperforms both DNN and CNN models, delivering better enhanced-speech quality with fewer parameters.

A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

jzi040941/PercepNet Interspeech 2020

Over the past few years, speech enhancement methods based on deep learning have greatly surpassed traditional methods based on spectral subtraction and spectral estimation.