Speech Enhancement

217 papers with code • 12 benchmarks • 19 datasets

Speech Enhancement is a signal processing task that improves the quality of speech signals captured under noisy or degraded conditions. The goal is to make speech clearer, more intelligible, and more pleasant to listen to, which benefits applications such as voice recognition, teleconferencing, and hearing aids.

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)
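
To make the task definition concrete, here is a minimal sketch of classical spectral subtraction, one of the simplest enhancement techniques. It is not the method of any paper listed on this page. The function names, the frame and hop sizes, the spectral floor, and the synthetic tone-plus-noise signal are all illustrative assumptions, and the sketch assumes a noise-only clip is available for estimating the noise spectrum.

```python
import numpy as np

FRAME_LEN = 512   # samples per analysis frame (illustrative choice)
HOP = 128         # frame shift; FRAME_LEN must be a multiple of HOP here

def stft(x):
    """Short-time Fourier transform with a Hann window (frames x bins)."""
    window = np.hanning(FRAME_LEN)
    n_frames = 1 + (len(x) - FRAME_LEN) // HOP
    frames = np.stack([x[i * HOP:i * HOP + FRAME_LEN] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec):
    """Inverse STFT by weighted overlap-add with the same Hann window."""
    window = np.hanning(FRAME_LEN)
    frames = np.fft.irfft(spec, n=FRAME_LEN, axis=1) * window
    out = np.zeros(HOP * (len(frames) - 1) + FRAME_LEN)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME_LEN] += frame
        norm[i * HOP:i * HOP + FRAME_LEN] += window ** 2
    return out / np.maximum(norm, 1e-8)

def spectral_subtraction(noisy, noise_clip, floor=0.05):
    """Subtract the average noise magnitude spectrum from every frame.

    `noise_clip` is assumed to contain noise only. `floor` keeps a small
    fraction of the original magnitude so that no bin goes negative.
    """
    n = len(noisy)
    pad_end = FRAME_LEN + (-n) % HOP          # pad so every sample is fully covered
    padded = np.pad(noisy, (FRAME_LEN, pad_end))
    spec = stft(padded)
    mag, phase = np.abs(spec), np.angle(spec)
    noise_mag = np.abs(stft(noise_clip)).mean(axis=0)
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    enhanced = istft(clean_mag * np.exp(1j * phase))  # reuse the noisy phase
    return enhanced[FRAME_LEN:FRAME_LEN + n]

def snr_db(reference, estimate):
    """Signal-to-noise ratio of `estimate` against `reference`, in dB."""
    error = reference - estimate
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

# Synthetic stand-in for speech: 0.25 s of silence, then a 440 Hz tone.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
clean[: sr // 4] = 0.0
noisy = clean + 0.1 * rng.standard_normal(sr)

enhanced = spectral_subtraction(noisy, noise_clip=noisy[: sr // 4])
snr_before = snr_db(clean, noisy)
snr_after = snr_db(clean, enhanced)
print(f"SNR before: {snr_before:.1f} dB, after: {snr_after:.1f} dB")
```

The sketch modifies only the magnitude spectrum and reuses the noisy phase, which is the usual simplification in this classical approach. The learned models listed below typically replace the fixed subtraction rule with a network that predicts a mask or the clean signal directly.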

FSPEN: An Ultra-Lightweight Network for Real Time Speech Enhancement

gitwukeyi/FSPEN Conference 2024

Deep learning-based speech enhancement methods have shown promising results in recent years.

Stars: 1
15 Apr 2024

How to train your ears: Auditory-model emulation for large-dynamic-range inputs and mild-to-severe hearing losses

p-leer/howtotrainyourears 15 Mar 2024

Our results show that this new optimization objective significantly improves the emulation performance of deep neural networks across relevant input sound levels and auditory-model frequency channels, without increasing the computational load during inference.

Stars: 0

Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks

vikastokala/bcctn 8 Mar 2024

Studies have shown that in noisy acoustic environments, providing binaural signals to the user of an assistive listening device may improve speech intelligibility and spatial awareness.

Stars: 5

Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech

JasonSWFu/VQscore 26 Feb 2024

To improve the robustness of the encoder for SE, a novel self-distillation mechanism combined with adversarial training is introduced.

Stars: 8

Improving Design of Input Condition Invariant Speech Enhancement

espnet/espnet 25 Jan 2024

In this paper, we propose novel architectures to improve the input-condition-invariant SE model so that performance in simulated conditions remains competitive while degradation in real conditions is substantially mitigated.

Stars: 7,858

A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech Enhancement

zhangyuewei98/fdfnet 19 Jan 2024

The two-stage pipeline is popular in speech enhancement tasks due to its superiority over traditional single-stage methods.

Stars: 4

A Refining Underlying Information Framework for Monaural Speech Enhancement

caoruitju/rui_se 18 Dec 2023

By bridging speech enhancement and the Information Bottleneck principle in this letter, we rethink a universal plug-and-play strategy and propose a Refining Underlying Information framework called RUI to rise to the challenges in both theory and practice.

Stars: 35

D4AM: A General Denoising Framework for Downstream Acoustic Models

changlee0903/d4am 28 Nov 2023

To our knowledge, this is the first work that deploys an effective combination scheme of regression (denoising) and classification (ASR) objectives to derive a general pre-processor applicable to various unseen ASR systems.

Stars: 12
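
The D4AM entry above describes combining a regression (denoising) objective with a classification (ASR) objective. The listing does not say how the two are combined, so the following is only a generic sketch under a simple assumption: a fixed weighted sum of a mean-squared-error term and a cross-entropy term. The function names, the weight `alpha`, and the toy arrays are hypothetical and are not taken from the D4AM paper or its code.

```python
import numpy as np

def mse_loss(enhanced, clean):
    """Regression (denoising) term: mean squared error against the clean signal."""
    return float(np.mean((enhanced - clean) ** 2))

def cross_entropy_loss(logits, labels):
    """Classification (ASR) term: mean cross-entropy over frames.

    `logits` has shape (frames, classes); `labels` holds one class index per frame.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerically stable log-softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def combined_loss(enhanced, clean, logits, labels, alpha=0.5):
    """Weighted sum of both terms; `alpha` is a hypothetical fixed trade-off weight."""
    return mse_loss(enhanced, clean) + alpha * cross_entropy_loss(logits, labels)

# Toy inputs: a slightly offset "enhanced" signal and uninformative ASR logits.
clean = np.zeros(8)
enhanced = clean + 0.1
logits = np.zeros((4, 3))
labels = np.array([0, 1, 2, 0])

loss = combined_loss(enhanced, clean, logits, labels, alpha=0.5)
print(f"combined loss: {loss:.4f}")
```

A fixed weight is the simplest possible design choice. In practice the balance between the two terms often needs tuning, because the terms can differ in scale and their gradients can conflict.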

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

pytorch/audio 27 Oct 2023

TorchAudio is an open-source audio and speech processing library built for PyTorch.

Stars: 2,379

LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

alibaba-damo-academy/funcodec 7 Oct 2023

In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.

Stars: 273