Search Results for author: Thomas Merritt

Found 13 papers, 3 papers with code

AE-Flow: AutoEncoder Normalizing Flow

no code implementations • 27 Dec 2023 • Jakub Mosiński, Piotr Biliński, Thomas Merritt, Abdelhamid Ezzerg, Daniel Korzekwa

The results show that the proposed training paradigm systematically improves speaker similarity and naturalness when compared to regular training methods of normalizing flows.

Voice Conversion

Creating New Voices using Normalizing Flows

no code implementations • 22 Dec 2023 • Piotr Bilinski, Thomas Merritt, Abdelhamid Ezzerg, Kamil Pokora, Sebastian Cygert, Kayoko Yanagisawa, Roberto Barra-Chicote, Daniel Korzekwa

As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities.

Speech Synthesis, Voice Conversion

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

no code implementations • 31 Jul 2023 • Guangyan Zhang, Thomas Merritt, Manuel Sam Ribeiro, Biel Tura-Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong assumptions about the distributions of the target data space.

Acoustic Modelling, Speech Synthesis, +1

Remap, warp and attend: Non-parallel many-to-many accent conversion with Normalizing Flows

no code implementations • 10 Nov 2022 • Abdelhamid Ezzerg, Thomas Merritt, Kayoko Yanagisawa, Piotr Bilinski, Magdalena Proszewska, Kamil Pokora, Renard Korzeniowski, Roberto Barra-Chicote, Daniel Korzekwa

Regional accents of the same language affect not only how words are pronounced (i.e., phonetic content), but also impact prosodic aspects of speech such as speaking rate and intonation.

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion

no code implementations • 4 Jul 2022 • Magdalena Proszewska, Grzegorz Beringer, Daniel Sáez-Trigueros, Thomas Merritt, Abdelhamid Ezzerg, Roberto Barra-Chicote

We evaluate our models in terms of intelligibility, speaker similarity and naturalness for intra- and cross-lingual conversion in seen and unseen languages.

Voice Conversion

Expressive, Variable, and Controllable Duration Modelling in TTS

no code implementations • 28 Jun 2022 • Ammar Abbas, Thomas Merritt, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Simon Slangen, Elia Gatti, Thomas Drugman

First, we propose a duration model conditioned on phrasing that improves the predicted durations and provides better modelling of pauses.

Normalising Flows, Speech Synthesis

Text-free non-parallel many-to-many voice conversion using normalising flows

no code implementations • 15 Mar 2022 • Thomas Merritt, Abdelhamid Ezzerg, Piotr Biliński, Magdalena Proszewska, Kamil Pokora, Roberto Barra-Chicote, Daniel Korzekwa

We investigate normalising flows for VC in both text-conditioned and text-free scenarios.

Normalising Flows, Speech Synthesis, +2

Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech

no code implementations • 24 Jun 2021 • Raahil Shah, Kamil Pokora, Abdelhamid Ezzerg, Viacheslav Klimkov, Goeric Huybrechts, Bartosz Putrycz, Daniel Korzekwa, Thomas Merritt

In this paper, we present a method for building highly expressive TTS voices with as little as 15 minutes of speech data from the target speaker.

Generative Adversarial Network

Low-resource expressive text-to-speech using data augmentation

no code implementations • 11 Nov 2020 • Goeric Huybrechts, Thomas Merritt, Giulia Comini, Bartek Perz, Raahil Shah, Jaime Lorenzo-Trueba

While recent neural text-to-speech (TTS) systems perform remarkably well, they typically require a substantial amount of recordings from the target speaker reading in the desired speaking style.

Data Augmentation, Voice Conversion

Towards achieving robust universal neural vocoding

1 code implementation • 4 Jul 2019 • Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal

This vocoder is shown to be capable of generating speech of consistently good quality (98% relative mean MUSHRA when compared to natural speech) regardless of whether the input spectrogram comes from a speaker or style seen during training or from an out-of-domain scenario when the recording conditions are studio-quality.

In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data

1 code implementation • NAACL 2019 • Nishant Prateek, Mateusz Łajszczak, Roberto Barra-Chicote, Thomas Drugman, Jaime Lorenzo-Trueba, Thomas Merritt, Srikanth Ronanki, Trevor Wood

Neural text-to-speech synthesis (NTTS) models have shown significant progress in generating high-quality speech; however, they require a large quantity of training data.

Speech Synthesis, Text-To-Speech Synthesis, +1

Effect of data reduction on sequence-to-sequence neural TTS

no code implementations • 15 Nov 2018 • Javier Latorre, Jakub Lachowicz, Jaime Lorenzo-Trueba, Thomas Merritt, Thomas Drugman, Srikanth Ronanki, Klimkov Viacheslav

Recent speech synthesis systems based on sampling from autoregressive neural network models can generate speech almost indistinguishable from human recordings.

Speech Synthesis

Robust universal neural vocoding

8 code implementations • 15 Nov 2018 • Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote

This paper introduces a robust universal neural vocoder trained with 74 speakers (comprised of both genders) coming from 17 languages.
