We propose a novel Multi-Scale Spectrogram (MSS) modelling approach to synthesise speech with an improved coarse and fine-grained prosody.
Many factors influence speech yielding different renditions of a given sentence.
In this paper we investigate the use of a sentence-level conditioning vector to improve the signal quality of a Parallel WaveNet neural vocoder.
In Stage II, we propose a novel method to sample from this learnt prosodic distribution using the contextual information available in text.
This paper addresses the problem of estimating the voice source directly from speech waveforms.
The source signal is obtained by concatenating excitation frames picked up from the codebook, based on a selection criterion and taking target residual coefficients as input.
We present UTACO, a singing synthesis model based on an attention-based sequence-to-sequence mechanism and a vocoder based on dilated causal convolutions.
We present an approach to synthesize whisper by applying a handcrafted signal processing recipe and Voice Conversion (VC) techniques to convert normally phonated speech to whispered speech.
This vocoder is shown to be capable of generating speech of consistently good quality (98% relative mean MUSHRA when compared to natural speech) regardless of whether the input spectrogram comes from a speaker or style seen during training or from an out-of-domain scenario when the recording conditions are studio-quality.
no code implementations • 19 Jan 2018 • Matei Mancas, Christian Frisson, Joëlle Tilmanne, Nicolas D'Alessandro, Petr Barborka, Furkan Bayansar, Francisco Bernard, Rebecca Fiebrink, Alexis Heloir, Edgar Hemery, Sohaib Laraba, Alexis Moinet, Fabrizio Nunnari, Thierry Ravet, Loïc Reboursière, Alvaro Sarasua, Mickaël Tits, Noé Tits, François Zajéga, Paolo Alborno, Ksenia Kolykhalova, Emma Frid, Damiano Malafronte, Lisanne Huis in't Veld, Hüseyin Cakmak, Kevin El Haddad, Nicolas Riche, Julien Leroy, Pierre Marighetto, Bekir Berker Türker, Hossein Khaki, Roberto Pulisci, Emer Gilmartin, Fasih Haider, Kübra Cengiz, Martin Sulir, Ilaria Torre, Shabbir Marzban, Ramazan Yazıcı, Furkan Burak Bâgcı, Vedat Gazi Kılı, Hilal Sezer, Sena Büsra Yenge, Charles-Alexandre Delestage, Sylvie Leleu-Merviel, Muriel Meyer-Chemenska, Daniel Schmitt, Willy Yvart, Stéphane Dupont, Ozan Can Altiok, Aysegül Bumin, Ceren Dikmen, Ivan Giangreco, Silvan Heller, Emre Külah, Gueorgui Pironkov, Luca Rossetto, Yusuf Sahillioglu, Heiko Schuldt, Omar Seddati, Yusuf Setinkaya, Metin Sezgin, Claudiu Tanase, Emre Toyan, Sean Wood, Doguhan Yeke, Françcois Rocca, Pierre-Henri De Deken, Alessandra Bandrabur, Fabien Grisard, Axel Jean-Caurant, Vincent Courboulay, Radhwan Ben Madhkour, Ambroise Moreau
The 11th Summer Workshop on Multimodal Interfaces eNTERFACE 2015 was hosted by the Numediart Institute of Creative Technologies of the University of Mons from August 10th to September 2015.