Search Results for author: Kayoko Yanagisawa

Found 8 papers, 0 papers with code

Creating New Voices using Normalizing Flows

no code implementations • 22 Dec 2023 • Piotr Bilinski, Thomas Merritt, Abdelhamid Ezzerg, Kamil Pokora, Sebastian Cygert, Kayoko Yanagisawa, Roberto Barra-Chicote, Daniel Korzekwa

As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities.

Speech Synthesis, Voice Conversion

Cross-lingual Knowledge Distillation via Flow-based Voice Conversion for Robust Polyglot Text-To-Speech

no code implementations • 15 Sep 2023 • Dariusz Piotrowski, Renard Korzeniowski, Alessio Falai, Sebastian Cygert, Kamil Pokora, Georgi Tinchev, Ziyao Zhang, Kayoko Yanagisawa

In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker.

Knowledge Distillation, Speech Synthesis, +1

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

no code implementations • 31 Jul 2023 • Guangyan Zhang, Thomas Merritt, Manuel Sam Ribeiro, Biel Tura-Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong assumptions about the distributions of the target data space.

Acoustic Modelling, Speech Synthesis, +1

Modelling low-resource accents without accent-specific TTS frontend

no code implementations • 11 Jan 2023 • Georgi Tinchev, Marta Czarnowska, Kamil Deja, Kayoko Yanagisawa, Marius Cotescu

Prior work on modelling accents assumes a phonetic transcription is available for the target accent, which might not be the case for low-resource, regional accents.

Voice Conversion

Remap, warp and attend: Non-parallel many-to-many accent conversion with Normalizing Flows

no code implementations • 10 Nov 2022 • Abdelhamid Ezzerg, Thomas Merritt, Kayoko Yanagisawa, Piotr Bilinski, Magdalena Proszewska, Kamil Pokora, Renard Korzeniowski, Roberto Barra-Chicote, Daniel Korzekwa

Regional accents of the same language affect not only how words are pronounced (i.e., phonetic content), but also prosodic aspects of speech such as speaking rate and intonation.

Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS)

no code implementations • 4 Jul 2022 • Ziyao Zhang, Alessio Falai, Ariadna Sanchez, Orazio Angelini, Kayoko Yanagisawa

Training multilingual Neural Text-To-Speech (NTTS) models using only monolingual corpora has emerged as a popular way of building voice-cloning-based Polyglot NTTS systems.

Speech Synthesis, Voice Cloning

Unify and Conquer: How Phonetic Feature Representation Affects Polyglot Text-To-Speech (TTS)

no code implementations • 4 Jul 2022 • Ariadna Sanchez, Alessio Falai, Ziyao Zhang, Orazio Angelini, Kayoko Yanagisawa

In this paper, we conduct a comprehensive study comparing multilingual NTTS models trained with both representations.

Singing Synthesis: with a little help from my attention

no code implementations • 12 Dec 2019 • Orazio Angelini, Alexis Moinet, Kayoko Yanagisawa, Thomas Drugman

We present UTACO, a singing synthesis model based on an attention-based sequence-to-sequence mechanism and a vocoder based on dilated causal convolutions.
