Search Results for author: Julius Richter

Found 13 papers, 8 papers with code

Diffusion Models for Audio Restoration

no code implementations • 15 Feb 2024 • Jean-Marie Lemercier, Julius Richter, Simon Welker, Eloi Moliner, Vesa Välimäki, Timo Gerkmann

Here, we aim to show that diffusion models can combine the best of both worlds and offer the opportunity to design audio restoration algorithms with a good degree of interpretability and a remarkable performance in terms of sound quality.

Speech Enhancement

Single and Few-step Diffusion for Generative Speech Enhancement

1 code implementation • 18 Sep 2023 • Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann

While the performance of standard generative diffusion algorithms drops dramatically when the number of function evaluations (NFEs) is lowered to obtain single-step diffusion, we show that our proposed method maintains steady performance and therefore largely outperforms the diffusion baseline in this setting, while also generalizing better than its predictive counterpart.

Denoising, Speech Enhancement

On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings

no code implementations • 5 Jun 2023 • Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Tal Peer, Timo Gerkmann

Since its inception, the field of deep speech enhancement has been dominated by predictive (discriminative) approaches, such as spectral mapping or masking.

Denoising, Speech Enhancement

Audio-Visual Speech Enhancement with Score-Based Generative Models

no code implementations • 2 Jun 2023 • Julius Richter, Simone Frintrop, Timo Gerkmann

This paper introduces an audio-visual speech enhancement system that leverages score-based generative models, also known as diffusion models, conditioned on visual information.

Automatic Speech Recognition, Lipreading, +3

Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model

1 code implementation • 31 May 2023 • Héctor Martel, Julius Richter, Kai Li, Xiaolin Hu, Timo Gerkmann

We propose Audio-Visual Lightweight ITerative model (AVLIT), an effective and lightweight neural network that uses Progressive Learning (PL) to perform audio-visual speech separation in noisy environments.

Speech Separation

Speech Signal Improvement Using Causal Generative Diffusion Models

no code implementations • 15 Mar 2023 • Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann

In this paper, we present a causal speech signal improvement system that is designed to handle different types of distortions.

Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement

2 code implementations • 28 Feb 2023 • Bunlong Lay, Simon Welker, Julius Richter, Timo Gerkmann

Recently, score-based generative models have been successfully employed for the task of speech enhancement.

Speech Enhancement

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

2 code implementations • 22 Dec 2022 • Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann

As diffusion models are generative approaches, they may also produce vocalizing and breathing artifacts in adverse conditions.

Speech Dereverberation

Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration

1 code implementation • 4 Nov 2022 • Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann

In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks.

Bandwidth Extension, Speech Denoising, +1

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2023 • Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann

This matches our forward process, which moves from clean speech to noisy speech by including a drift term.

Ranked #19 on Speech Enhancement on VoiceBank + DEMAND

Speech Dereverberation

Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain

1 code implementation • 31 Mar 2022 • Simon Welker, Julius Richter, Timo Gerkmann

Score-based generative models (SGMs) have recently shown impressive results for difficult generative tasks such as the unconditional and conditional generation of natural images and audio signals.

Speech Enhancement

Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement

1 code implementation • 19 May 2021 • Guillaume Carbajal, Julius Richter, Timo Gerkmann

In this work, we propose to use an adversarial training scheme for variational autoencoders to disentangle the label from the other latent variables.

Attribute Disentanglement +1

Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier

no code implementations • 12 Feb 2021 • Guillaume Carbajal, Julius Richter, Timo Gerkmann

In this paper, we propose to guide the variational autoencoder with a supervised classifier separately trained on noisy speech.

Speech Enhancement
