Search Results for author: Goeric Huybrechts

Found 10 papers, 0 papers with code

DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR

no code implementations • 13 Jun 2023 • Goeric Huybrechts, Srikanth Ronanki, Xilai Li, Hadis Nosrati, Sravan Bodapati, Katrin Kirchhoff

To address this issue, we propose the integration of a novel dynamic contextual carry-over mechanism in a state-of-the-art (SOTA) unified ASR system.

Automatic Speech Recognition (ASR) +1

Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR

no code implementations • 18 Apr 2023 • Xilai Li, Goeric Huybrechts, Srikanth Ronanki, Jeff Farris, Sravan Bodapati

Overall, our proposed model reduces the degradation of the streaming mode over the non-streaming full-contextual model from 41.7% and 45.7% to 16.7% and 26.2% on the LibriSpeech test-clean and test-other datasets respectively, while improving by a relative 15.5% WER over the previous state-of-the-art unified model.

Speech Recognition

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

no code implementations • 29 Jul 2022 • Giulia Comini, Goeric Huybrechts, Manuel Sam Ribeiro, Adam Gabrys, Jaime Lorenzo-Trueba

The availability of data in expressive styles across languages is limited, and recording sessions are costly and time consuming.

Data Augmentation Voice Conversion

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

no code implementations • 16 Feb 2022 • Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba

It uses voice conversion (VC) as a post-processing module appended to a pre-existing high-quality TTS system and marks a conceptual shift in the existing TTS paradigm, framing the few-shot TTS problem as a VC task.

Speech Synthesis Voice Conversion

Cross-speaker style transfer for text-to-speech using data augmentation

no code implementations • 10 Feb 2022 • Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba

The proposed approach relies on voice conversion to first generate high-quality data from the set of supporting expressive speakers.

Data Augmentation Style Transfer +1

Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech

no code implementations • 24 Jun 2021 • Raahil Shah, Kamil Pokora, Abdelhamid Ezzerg, Viacheslav Klimkov, Goeric Huybrechts, Bartosz Putrycz, Daniel Korzekwa, Thomas Merritt

In this paper, we present a method for building highly expressive TTS voices with as little as 15 minutes of speech data from the target speaker.

Generative Adversarial Network

EmoCat: Language-agnostic Emotional Voice Conversion

no code implementations • 14 Jan 2021 • Bastian Schnell, Goeric Huybrechts, Bartek Perz, Thomas Drugman, Jaime Lorenzo-Trueba

In this work we propose EmoCat, a language-agnostic emotional voice conversion model.

Voice Conversion

Low-resource expressive text-to-speech using data augmentation

no code implementations • 11 Nov 2020 • Goeric Huybrechts, Thomas Merritt, Giulia Comini, Bartek Perz, Raahil Shah, Jaime Lorenzo-Trueba

While recent neural text-to-speech (TTS) systems perform remarkably well, they typically require a substantial amount of recordings from the target speaker reading in the desired speaking style.

Data Augmentation Voice Conversion

Voice Conversion for Whispered Speech Synthesis

no code implementations • 11 Dec 2019 • Marius Cotescu, Thomas Drugman, Goeric Huybrechts, Jaime Lorenzo-Trueba, Alexis Moinet

We present an approach to synthesize whisper by applying a handcrafted signal processing recipe and Voice Conversion (VC) techniques to convert normally phonated speech to whispered speech.

Speech Synthesis Voice Conversion

Traditional Machine Learning for Pitch Detection

no code implementations • 4 Mar 2019 • Thomas Drugman, Goeric Huybrechts, Viacheslav Klimkov, Alexis Moinet

In this paper, we consider voicing detection as a classification problem and F0 contour estimation as a regression problem.

BIG-bench Machine Learning Clustering +1
