Search Results for author: Jean-Marc Valin

Found 22 papers, 7 papers with code

Noise-Robust DSP-Assisted Neural Pitch Estimation with Very Low Complexity

no code implementations • 25 Sep 2023 • Krishna Subramani, Jean-Marc Valin, Jan Buethe, Paris Smaragdis, Mike Goodwin

Pitch estimation is an essential step of many speech processing algorithms, including speech coding, synthesis, and enhancement.

LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions

no code implementations • 13 Jul 2023 • Jan Büthe, Jean-Marc Valin, Ahmed Mustafa

Classical speech coding uses low-complexity postfilters with zero lookahead to enhance the quality of coded speech, but their effectiveness is limited by their simplicity.

A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement

no code implementations • 23 Feb 2023 • Zhepei Wang, Ritwik Giri, Devansh Shah, Jean-Marc Valin, Michael M. Goodwin, Paris Smaragdis

In this study, we present an approach to train a single speech enhancement network that can perform both personalized and non-personalized speech enhancement.

Multi-Task Learning, Speech Enhancement

Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity

no code implementations • 8 Dec 2022 • Ahmed Mustafa, Jean-Marc Valin, Jan Büthe, Paris Smaragdis, Mike Goodwin

GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models.

Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model

1 code implementation • 11 May 2022 • Jean-Marc Valin, Ahmed Mustafa, Christopher Montgomery, Timothy B. Terriberry, Michael Klingbeil, Paris Smaragdis, Arvindh Krishnaswamy

As deep speech enhancement algorithms have recently demonstrated capabilities greatly surpassing their traditional counterparts for suppressing noise, reverberation and echo, attention is turning to the problem of packet loss concealment (PLC).

Packet Loss Concealment, Speech Synthesis

End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation

1 code implementation • 23 Feb 2022 • Krishna Subramani, Jean-Marc Valin, Umut Isik, Paris Smaragdis, Arvindh Krishnaswamy

Neural vocoders have recently demonstrated high quality speech synthesis, but typically require a high computational complexity.

Speech Synthesis

Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet

2 code implementations • 22 Feb 2022 • Jean-Marc Valin, Umut Isik, Paris Smaragdis, Arvindh Krishnaswamy

Neural speech synthesis models can synthesize high quality speech but typically require a high computational complexity to do so.

Speech Synthesis

Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget

no code implementations • 15 Jun 2021 • Lukas Drude, Jahn Heymann, Andreas Schwarz, Jean-Marc Valin

Automatic speech recognition (ASR) in the cloud allows the use of larger models and more powerful multi-channel signal processing front-ends compared to on-device processing.

Automatic Speech Recognition (ASR)

Personalized PercepNet: Real-time, Low-complexity Target Voice Separation and Enhancement

no code implementations • 8 Jun 2021 • Ritwik Giri, Shrikant Venkataramani, Jean-Marc Valin, Umut Isik, Arvindh Krishnaswamy

The presence of multiple talkers in the surrounding environment poses a difficult challenge for real-time speech communication systems considering the constraints on network size and complexity.

Semi-Supervised Singing Voice Separation with Noisy Self-Training

no code implementations • 16 Feb 2021 • Zhepei Wang, Ritwik Giri, Umut Isik, Jean-Marc Valin, Arvindh Krishnaswamy

Given a limited set of labeled data, we present a method to leverage a large volume of unlabeled data to improve the model's performance.

Data Augmentation

Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

no code implementations • 12 Feb 2021 • Jonah Casebeer, Vinjai Vale, Umut Isik, Jean-Marc Valin, Ritwik Giri, Arvindh Krishnaswamy

Audio codecs based on discretized neural autoencoders have recently been developed and shown to provide significantly higher compression levels for comparable quality speech output.

PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

no code implementations • 11 Aug 2020 • Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

Neural network applications generally benefit from larger-sized models, but for current speech enhancement models, larger scale networks often suffer from decreased robustness to the variety of real-world use cases beyond what is encountered in training data.

Speech Enhancement

A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet

2 code implementations • 28 Mar 2019 • Jean-Marc Valin, Jan Skoglund

We demonstrate that LPCNet operating at 1.6 kb/s achieves significantly higher quality than MELP and that uncompressed LPCNet can exceed the quality of a waveform codec operating at low bitrate.

Speech Synthesis

LPCNet: Improving Neural Speech Synthesis Through Linear Prediction

2 code implementations • 28 Oct 2018 • Jean-Marc Valin, Jan Skoglund

We demonstrate that LPCNet can achieve significantly higher quality than WaveRNN for the same network size and that high quality LPCNet speech synthesis is achievable with a complexity under 3 GFLOPS.

Speech Synthesis, Text to Speech

A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement

2 code implementations • 24 Sep 2017 • Jean-Marc Valin

Despite noise suppression being a mature area in signal processing, it remains highly dependent on fine tuning of estimator algorithms and parameters.

Sound, Audio and Speech Processing
