Speech enhancement is the task of taking a noisy speech signal as input and producing a cleaner, more intelligible speech signal as output.
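As a rough illustration of this setup, the sketch below builds a noisy input by adding noise to a clean signal at a chosen signal-to-noise ratio (SNR), then measures the SNR of the result. It assumes a simple additive noise model and uses a sine wave as a stand-in for speech. The helper names `mix_at_snr` and `measure_snr_db` are hypothetical and do not come from any of the listed repositories.

```python
import numpy as np

def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB (hypothetical helper)."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR(dB) = 10 * log10(clean_power / noise_power), solved for the noise power.
    target_noise_power = clean_power / (10 ** (target_snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return clean + scale * noise

def measure_snr_db(reference, estimate):
    """SNR of `estimate` against the clean `reference`, in dB (hypothetical helper)."""
    residual = estimate - reference
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(residual ** 2))

rng = np.random.default_rng(0)
sample_rate = 16000                       # assumed sample rate
t = np.arange(sample_rate) / sample_rate  # one second of audio
clean = np.sin(2 * np.pi * 220 * t)       # sine wave standing in for clean speech
noise = rng.standard_normal(t.shape)      # white noise standing in for background noise

noisy = mix_at_snr(clean, noise, target_snr_db=5.0)
input_snr = measure_snr_db(clean, noisy)
print(f"input SNR: {input_snr:.2f} dB")
```

An enhancement model would take `noisy` and return an estimate of `clean`. Calling `measure_snr_db(clean, estimate)` and comparing it with `input_snr` gives one simple measure of improvement, although perceptual metrics are usually reported as well.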

Spleeter: A Fast And State-of-the Art Music Source Separation Tool With Pre-trained Models

deezer/spleeter ISMIR 2019 Late-Breaking/Demo 2019

We present and release a new tool for music source separation with pre-trained models called Spleeter. Spleeter was designed with ease of use, separation performance and speed in mind.

Ranked #10 on Music Source Separation on MUSDB18 (using extra training data)

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

alexjc/neural-enhance 27 Mar 2016

We consider image transformation problems, where an input image is transformed into an output image.

Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription

espnet/espnet 22 Apr 2020

To demonstrate this, we use the CHiME-6 Challenge data as an example of challenging environments and noisy conditions of everyday speech.

MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

speechbrain/speechbrain 12 Oct 2021

Most of the deep learning-based speech enhancement models are learned in a supervised manner, which implies that pairs of noisy and clean speech are required during training.

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement

speechbrain/speechbrain 8 Apr 2021

The discrepancy between the cost function used for training a speech enhancement model and human auditory perception usually makes the quality of enhanced speech unsatisfactory.

HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

coqui-ai/TTS 10 Jun 2020

Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion.

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

facebookresearch/demucs 20 Sep 2018

The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.
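The decoupling of phase and magnitude mentioned above can be sketched with an oracle magnitude mask. This is a minimal illustration and not the Conv-TasNet method: it uses non-overlapping rectangular frames in place of a proper windowed STFT, synthetic signals in place of speech, and hypothetical helper names (`frames_to_spec`, `spec_to_signal`). The mask changes only the magnitude, and the mixture's phase is reused unchanged.

```python
import numpy as np

FRAME = 512  # assumed frame length; frames do not overlap in this simplified sketch

def frames_to_spec(signal):
    """Split into non-overlapping frames and take the real FFT of each (hypothetical helper)."""
    usable = len(signal) - len(signal) % FRAME
    return np.fft.rfft(signal[:usable].reshape(-1, FRAME), axis=1)

def spec_to_signal(spec):
    """Invert frames_to_spec by inverse FFT and concatenation (hypothetical helper)."""
    return np.fft.irfft(spec, n=FRAME, axis=1).reshape(-1)

rng = np.random.default_rng(0)
n = FRAME * 32
t = np.arange(n) / 16000                 # assumed 16 kHz sample rate
source_a = np.sin(2 * np.pi * 300 * t)   # target source (stand-in for a speaker)
source_b = 0.5 * rng.standard_normal(n)  # interfering source (stand-in for noise)
mixture = source_a + source_b

spec_a = frames_to_spec(source_a)
spec_b = frames_to_spec(source_b)
spec_mix = frames_to_spec(mixture)

# Oracle ratio mask computed from the true source magnitudes.
mask = np.abs(spec_a) / (np.abs(spec_a) + np.abs(spec_b) + 1e-8)

# Only the magnitude is modified; the phase is taken from the mixture as-is.
masked_spec = mask * np.abs(spec_mix) * np.exp(1j * np.angle(spec_mix))
estimate = spec_to_signal(masked_spec)

mixture_error = np.mean((mixture - source_a) ** 2)
estimate_error = np.mean((estimate - source_a) ** 2)
print(f"error before masking: {mixture_error:.4f}, after: {estimate_error:.4f}")
```

Even with a mask built from the true sources, the estimate keeps the mixture's phase, so some error remains. This is one of the limitations that time-domain approaches aim to avoid.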

DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement

mpariente/asteroid Interspeech 2020

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality.

Real Time Speech Enhancement in the Waveform Domain

facebookresearch/denoiser 23 Jun 2020

The proposed model matches state-of-the-art performance of both causal and non causal methods while working directly on the raw waveform.

SEGAN: Speech Enhancement Generative Adversarial Network

facebookresearch/denoiser 28 Mar 2017

In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.

