Speech Enhancement

157 papers with code • 11 benchmarks • 16 datasets

Speech enhancement is the task of taking a noisy speech input and producing an enhanced speech output, typically with improved perceptual quality and intelligibility.

Most implemented papers

Proximal Policy Optimization Algorithms

ray-project/ray 20 Jul 2017

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.
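The excerpt above does not say what the "surrogate" objective looks like. As a rough illustration only, the sketch below assumes the clipped variant described in the PPO paper: for each sample, take the minimum of the unclipped and clipped probability-ratio terms, then average. The function name `clipped_surrogate`, the sample values, and the default `epsilon=0.2` are illustrative assumptions, not taken from this listing or from any library.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize).

    ratios:     new-policy / old-policy probability ratios, one per sample
    advantages: advantage estimates, one per sample
    epsilon:    clipping range (illustrative default, an assumption here)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


# Hypothetical batch of three samples.
ratios = [1.5, 0.5, 1.0]
advantages = [1.0, -1.0, 2.0]
objective = clipped_surrogate(ratios, advantages)
```

In a training loop this value would be maximized with stochastic gradient ascent, as the excerpt describes; the sketch only evaluates the objective and computes no gradients.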

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

alexjc/neural-enhance 27 Mar 2016

We consider image transformation problems, where an input image is transformed into an output image.

SEGAN: Speech Enhancement Generative Adversarial Network

santi-pdp/segan 28 Mar 2017

In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

naplab/Conv-TasNet 20 Sep 2018

The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.

Phase-aware Speech Enhancement with Deep Complex U-Net

AppleHolic/source_separation ICLR 2019

Most deep learning-based models for speech enhancement have focused on estimating the magnitude of the spectrogram while reusing the phase from the noisy speech for reconstruction.
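The magnitude-only approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any listed paper: the helper `apply_magnitude_mask`, the three example frequency bins, and the mask values are all hypothetical, and a real system would obtain the mask from a trained model and work on full STFT frames.

```python
import cmath


def apply_magnitude_mask(noisy_bins, mask):
    """Scale each bin's magnitude by a real-valued gain and keep the noisy phase."""
    if len(noisy_bins) != len(mask):
        raise ValueError("mask and spectrum must have the same length")
    enhanced = []
    for coeff, gain in zip(noisy_bins, mask):
        # Split the complex coefficient into magnitude and phase.
        magnitude, phase = cmath.polar(coeff)
        # Only the magnitude changes; the phase comes from the noisy input.
        enhanced.append(cmath.rect(gain * magnitude, phase))
    return enhanced


# Hypothetical complex spectrum of one frame and a hypothetical mask in [0, 1].
noisy_bins = [3 + 4j, -1 + 1j, 0.5 - 0.5j]
mask = [1.0, 0.5, 0.0]
enhanced_bins = apply_magnitude_mask(noisy_bins, mask)
```

Because the phase is copied unchanged, any phase distortion introduced by the noise remains in the output, which is the limitation that phase-aware models aim to address.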

DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement

huyanxin/DeepComplexCRN Interspeech 2020

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality.

A Fully Convolutional Neural Network for Speech Enhancement

zhr1201/CNN-for-single-channel-speech-enhancement 22 Sep 2016

In hearing aids, the presence of babble noise greatly degrades the intelligibility of human speech.

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement

JasonSWFu/MetricGAN 13 May 2019

Adversarial loss in a conditional generative adversarial network (GAN) is not designed to directly optimize the evaluation metrics of a target task, and thus may not always guide the generator in a GAN to generate data with improved metric scores.

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

Edresson/VoiceSplit 11 Oct 2018

In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals, by making use of a reference signal from the target speaker.

FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement

haoxiangsnr/FullSubNet 29 Oct 2020

In our proposed FullSubNet, we connect a pure full-band model and a pure sub-band model sequentially and use practical joint training to integrate these two types of models' advantages.