Speech Enhancement

218 papers with code • 12 benchmarks • 19 datasets

Speech Enhancement is a signal processing task that involves improving the quality of speech signals captured under noisy or degraded conditions. The goal of speech enhancement is to make speech signals clearer, more intelligible, and more pleasant to listen to, benefiting applications such as voice recognition, teleconferencing, and hearing aids.

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)
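
As a rough sketch of the classical pipeline that the learning-based methods below build on or replace, here is a minimal spectral-subtraction example; the signal, FFT size, and noise-estimation heuristic are illustrative assumptions, not taken from any listed paper.

```python
# Minimal spectral-subtraction sketch: STFT -> estimate noise from the first
# frames -> subtract and floor -> inverse STFT with the noisy phase.
import numpy as np
from scipy.signal import stft, istft

def enhance(noisy, fs=16000, n_fft=512, noise_frames=10):
    _, _, spec = stft(noisy, fs=fs, nperseg=n_fft)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: the first few frames are noise-dominated; use their mean.
    noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
    # Subtract the noise estimate and floor the result to limit musical noise.
    clean_mag = np.maximum(mag - noise_mag, 0.05 * mag)
    _, enhanced = istft(clean_mag * np.exp(1j * phase), fs=fs, nperseg=n_fft)
    return enhanced

# Toy usage: a 440 Hz tone buried in white noise.
t = np.arange(16000) / 16000
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(16000)
print(enhance(noisy).shape)
```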

Libraries

Use these libraries to find Speech Enhancement models and implementations

Most implemented papers

Fast Multichannel Source Separation Based on Jointly Diagonalizable Spatial Covariance Matrices

sekiguchi92/SpeechEnhancement European Association for Signal Processing (EUSIPCO) 2019

A popular approach to multichannel source separation is to integrate a spatial model with a source model for estimating the spatial covariance matrices (SCMs) and power spectral densities (PSDs) of each sound source in the time-frequency domain.
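
As a hedged illustration of how estimated SCMs and PSDs are typically turned into separated signals, the sketch below applies a multichannel Wiener filter in NumPy; the function name, array shapes, and random inputs are assumptions for this example, not the authors' implementation.

```python
# Sketch: given per-source SCMs R[n] (F x M x M) and PSDs lam[n] (F x T),
# recover each source image with a multichannel Wiener filter.
import numpy as np

def mwf_separate(X, R, lam):
    """X: (F, T, M) mixture STFT; R: (N, F, M, M) SCMs; lam: (N, F, T) PSDs."""
    N, F, M, _ = R.shape
    # Mixture covariance at each (f, t): sum_n lam_n(f, t) * R_n(f)
    mix_cov = np.einsum('nft,nfij->ftij', lam, R)
    inv_mix = np.linalg.inv(mix_cov + 1e-6 * np.eye(M))
    images = np.empty((N,) + X.shape, dtype=complex)
    for n in range(N):
        # W_n(f, t) = lam_n(f, t) R_n(f) (sum_m lam_m(f, t) R_m(f))^-1
        W = np.einsum('ft,fij,ftjk->ftik', lam[n], R[n], inv_mix)
        images[n] = np.einsum('ftij,ftj->fti', W, X)
    return images

# Shape-only usage with random inputs (2 sources, 4 mics).
X = np.random.randn(257, 100, 4) + 1j * np.random.randn(257, 100, 4)
R = np.stack([np.tile(np.eye(4), (257, 1, 1)) for _ in range(2)])
lam = np.random.rand(2, 257, 100)
print(mwf_separate(X, R, lam).shape)  # (2, 257, 100, 4)
```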

RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement

CODEJIN/RHRNet 15 Apr 2019

Most current speech enhancement models use spectrogram features that require an expensive transformation and result in phase information loss.

Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing

bepierre/SpeechVGG 22 Oct 2019

Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer.

Improving GANs for Speech Enhancement

pquochuy/idsegan 15 Jan 2020

The first of the two proposed variants constrains the generators to learn a common mapping that is iteratively applied at all enhancement stages, resulting in a small model footprint.
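
A minimal sketch of that idea, assuming a toy PyTorch generator: the same module (and therefore the same weights) refines its own output at every stage, so adding stages does not add parameters. The architecture is a placeholder, not the paper's SEGAN-style generator.

```python
# One shared generator applied repeatedly across enhancement stages.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=31, padding=15),
            nn.PReLU(),
            nn.Conv1d(channels, 1, kernel_size=31, padding=15),
        )

    def forward(self, x):
        return self.net(x)

def iterated_enhance(noisy, generator, n_stages=3):
    # The same module (same weights) is applied at every stage.
    out = noisy
    for _ in range(n_stages):
        out = generator(out)
    return out

g = TinyGenerator()
waveform = torch.randn(1, 1, 16000)  # (batch, channel, samples)
print(iterated_enhance(waveform, g, n_stages=3).shape)  # torch.Size([1, 1, 16000])
```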

Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

uwjunqi/Tensor-Train-Neural-Network 3 Feb 2020

Finally, in 8-channel conditions, a PESQ of 3.12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3.06.

Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

microsoft/DNS-Challenge IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

This paper investigates several aspects of training an RNN (recurrent neural network) that impact the objective and subjective quality of enhanced speech for real-time single-channel speech enhancement.
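
A hedged sketch of the kind of weighted speech-distortion loss the title refers to, for a mask-based enhancer: one term penalizes attenuating the clean speech, the other penalizes residual noise, traded off by a weight alpha. The formulation and the alpha value are illustrative assumptions, not the paper's exact definition.

```python
# Weighted trade-off between speech distortion and residual noise for a
# magnitude-mask enhancer; all tensors are illustrative placeholders.
import torch

def weighted_sd_loss(mask, clean_spec, noise_spec, alpha=0.35):
    """mask, clean_spec, noise_spec: (batch, freq, time) magnitude tensors."""
    speech_distortion = ((1.0 - mask) * clean_spec).pow(2).mean()
    residual_noise = (mask * noise_spec).pow(2).mean()
    return alpha * speech_distortion + (1.0 - alpha) * residual_noise

# Example with random magnitudes standing in for STFT features.
mask = torch.rand(2, 257, 100)
clean = torch.rand(2, 257, 100)
noise = torch.rand(2, 257, 100)
print(weighted_sd_loss(mask, clean, noise).item())
```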

Deep Residual-Dense Lattice Network for Speech Enhancement

nick-nikzad/RDL-SE 27 Feb 2020

Motivated by this, we propose the residual-dense lattice network (RDL-Net), which is a new CNN for speech enhancement that employs both residual and dense aggregations without over-allocating parameters for feature re-usage.
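
A minimal sketch of combining dense aggregation (each layer sees the concatenation of earlier feature maps) with a residual skip around the block, assuming a toy PyTorch setup; channel sizes and depth are placeholders and do not follow the RDL-Net topology.

```python
# A block that mixes dense feature concatenation with a residual connection.
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    def __init__(self, channels=32, growth=16, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
                nn.ReLU(),
            ))
            in_ch += growth  # dense: each layer sees all previous outputs
        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        # Residual: fuse the dense features and add the block input back.
        return x + self.fuse(torch.cat(feats, dim=1))

block = ResidualDenseBlock()
spec = torch.randn(1, 32, 257, 100)  # (batch, channels, freq, time)
print(block(spec).shape)  # torch.Size([1, 32, 257, 100])
```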

Dual-Signal Transformation LSTM Network for Real-Time Noise Suppression

breizhn/DTLN Interspeech 2020

This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge).

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

uwjunqi/Tensor-Train-Neural-Network 25 Jul 2020

Finally, our experiments on multi-channel speech enhancement with a simulated noisy WSJ0 corpus demonstrate that the proposed hybrid CNN-TT architecture outperforms both DNN and CNN models, delivering better enhanced-speech quality with fewer parameters.

A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

jzi040941/PercepNet Interspeech 2020

Over the past few years, speech enhancement methods based on deep learning have greatly surpassed traditional methods based on spectral subtraction and spectral estimation.