Speech Enhancement

104 papers with code • 9 benchmarks • 13 datasets

Speech enhancement is the task of taking a noisy speech input and producing an enhanced speech output.
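As a rough illustration of that input/output contract, the sketch below uses a pure-Python toy setup. A 50 Hz sine stands in for the clean speech, Gaussian noise is added, and a moving average plays the role of the enhancer. The signal, the noise level, the filter width, and the helper names `snr_db` and `moving_average` are all assumptions made for this example; none of the papers listed here uses this method.

```python
import math
import random

def snr_db(reference, estimate):
    """SNR of `estimate` measured against the clean `reference`, in dB."""
    signal = sum(r * r for r in reference)
    error = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(signal / error)

def moving_average(x, width=9):
    """Centered moving average; windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

random.seed(0)
sample_rate = 8000
# Toy "clean speech": one second of a 50 Hz sine (an assumption for the demo).
clean = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(sample_rate)]
# Noisy input: the clean signal plus white Gaussian noise.
noisy = [c + random.gauss(0.0, 0.5) for c in clean]
# Enhanced output: a simple low-pass smoother, not a learned model.
enhanced = moving_average(noisy)

snr_before = snr_db(clean, noisy)
snr_after = snr_db(clean, enhanced)
print(f"SNR before: {snr_before:.1f} dB, after: {snr_after:.1f} dB")
```

The same before/after comparison applies to a learned model: only the function producing `enhanced` changes.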

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)

Greatest papers with code

Spleeter: A Fast And State-of-the Art Music Source Separation Tool With Pre-trained Models

deezer/spleeter ISMIR 2019 Late-Breaking/Demo

We present and release a new tool for music source separation with pre-trained models called Spleeter. Spleeter was designed with ease of use, separation performance and speed in mind.

Ranked #10 on Music Source Separation on MUSDB18 (using extra training data)

Music Source Separation • Speech Enhancement

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

alexjc/neural-enhance 27 Mar 2016

We consider image transformation problems, where an input image is transformed into an output image.

Image Super-Resolution • Nuclear Segmentation • +2

Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription

espnet/espnet 22 Apr 2020

To demonstrate this, we use the CHiME-6 Challenge data as an example of challenging environments and noisy conditions of everyday speech.

Data Augmentation • End-To-End Speech Recognition • +2

MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

speechbrain/speechbrain 12 Oct 2021

Most of the deep learning-based speech enhancement models are learned in a supervised manner, which implies that pairs of noisy and clean speech are required during training.

Speech Enhancement • Speech Quality
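The paired noisy/clean data that supervised training relies on is commonly synthesized by adding noise to clean recordings at a chosen signal-to-noise ratio. A minimal sketch of that mixing step follows, assuming a sine as the clean signal and Gaussian noise. The helper names `power` and `mix_at_snr` are hypothetical and do not come from speechbrain or the MetricGAN-U paper.

```python
import math
import random

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR in dB."""
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled

random.seed(0)
sample_rate = 16000
# Stand-in for a clean utterance: one second of a 220 Hz sine (assumption).
clean = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate)]

# One (noisy, clean) training pair at 5 dB SNR.
noisy, scaled_noise = mix_at_snr(clean, noise, snr_db=5.0)
achieved_snr_db = 10 * math.log10(power(clean) / power(scaled_noise))
```

An unsupervised approach has no access to `clean` during training and sees only signals like `noisy`.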

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement

speechbrain/speechbrain 8 Apr 2021

The discrepancy between the cost function used for training a speech enhancement model and human auditory perception usually makes the quality of enhanced speech unsatisfactory.

Speech Enhancement

HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

coqui-ai/TTS 10 Jun 2020

Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion.

Denoising • Speech Dereverberation

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

facebookresearch/demucs 20 Sep 2018

The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.

Multi-task Audio Source Separation • Music Source Separation • +3
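For context on the time-frequency formulation that the Conv-TasNet abstract contrasts itself with, here is a minimal sketch of oracle ratio masking on a single frame. It illustrates the masking baseline family, not Conv-TasNet itself. The frame length, the test signals, the naive `dft` helper, and the lack of windowing or overlap are simplifying assumptions. The mask is real-valued, so it rescales magnitudes only and the mixture's phase is reused unchanged, which is the phase/magnitude decoupling the abstract mentions.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def ratio_mask(speech_spec, noise_spec, eps=1e-12):
    """Oracle ratio mask |S|^2 / (|S|^2 + |N|^2), one value per frequency bin."""
    return [abs(s) ** 2 / (abs(s) ** 2 + abs(n) ** 2 + eps)
            for s, n in zip(speech_spec, noise_spec)]

random.seed(0)
frame_len = 64
# Toy sources (assumptions): a sine as the target and Gaussian noise as interference.
speech = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(frame_len)]
noise = [random.gauss(0.0, 0.3) for _ in range(frame_len)]
mixture = [s + n for s, n in zip(speech, noise)]

S, N, Y = dft(speech), dft(noise), dft(mixture)
mask = ratio_mask(S, N)
# Scale each mixture bin by the real-valued mask; the noisy phase is kept as-is.
estimate = [m * y for m, y in zip(mask, Y)]

# Spectral error against the clean target, before and after masking.
err_mixture = sum(abs(y - s) ** 2 for y, s in zip(Y, S))
err_masked = sum(abs(e - s) ** 2 for e, s in zip(estimate, S))
```

The mask here is computed from the true sources, so it is an upper-bound style reference. A trained system would have to predict it from the mixture alone.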

DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement

mpariente/asteroid Interspeech 2020

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality.

Speech Enhancement • Audio and Speech Processing • Sound

Real Time Speech Enhancement in the Waveform Domain

facebookresearch/denoiser 23 Jun 2020

The proposed model matches state-of-the-art performance of both causal and non-causal methods while working directly on the raw waveform.

Data Augmentation • Speech Enhancement

SEGAN: Speech Enhancement Generative Adversarial Network

facebookresearch/denoiser 28 Mar 2017

In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.

Speech Enhancement