Speech Enhancement
217 papers with code • 12 benchmarks • 19 datasets
Speech Enhancement is a signal processing task that involves improving the quality of speech signals captured under noisy or degraded conditions. The goal of speech enhancement is to make speech signals clearer, more intelligible, and more pleasant to listen to; enhanced speech is useful in applications such as voice recognition, teleconferencing, and hearing aids.
(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)
Latest papers
FSPEN: AN ULTRA-LIGHTWEIGHT NETWORK FOR REAL TIME SPEECH ENHANCEMENT
Deep learning-based speech enhancement methods have shown promising results in recent years.
How to train your ears: Auditory-model emulation for large-dynamic-range inputs and mild-to-severe hearing losses
Our results show that this new optimization objective significantly improves the emulation performance of deep neural networks across relevant input sound levels and auditory-model frequency channels, without increasing the computational load during inference.
Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks
Studies have shown that in noisy acoustic environments, providing binaural signals to the user of an assistive listening device may improve speech intelligibility and spatial awareness.
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
To improve the robustness of the encoder for SE, a novel self-distillation mechanism combined with adversarial training is introduced.
Improving Design of Input Condition Invariant Speech Enhancement
In this paper we propose novel architectures to improve the input condition invariant SE model so that performance in simulated conditions remains competitive while real condition degradation is much mitigated.
A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech Enhancement
Two-stage pipelines are popular in speech enhancement tasks due to their superiority over traditional single-stage methods.
A Refining Underlying Information Framework for Monaural Speech Enhancement
By bridging speech enhancement and the Information Bottleneck principle in this letter, we rethink a universal plug-and-play strategy and propose a Refining Underlying Information framework, called RUI, to rise to these challenges in both theory and practice.
D4AM: A General Denoising Framework for Downstream Acoustic Models
To our knowledge, this is the first work that deploys an effective combination scheme of regression (denoising) and classification (ASR) objectives to derive a general pre-processor applicable to various unseen ASR systems.
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch
TorchAudio is an open-source audio and speech processing library built for PyTorch.
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT
In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.