no code implementations • 29 Nov 2022 • Nikolaos Ellinas, Myrsini Christidou, Alexandra Vioni, June Sig Sung, Aimilios Chalamandaris, Pirros Tsiakoulis, Paris Mastorocostas
The final model enables fine-grained phoneme-level prosody control for all speakers contained in the training set, while maintaining the speaker identity.
no code implementations • 19 Nov 2021 • Myrsini Christidou, Alexandra Vioni, Nikolaos Ellinas, Georgios Vamvoukakis, Konstantinos Markopoulos, Panos Kakoulidis, June Sig Sung, Hyoungmin Park, Aimilios Chalamandaris, Pirros Tsiakoulis
This paper presents a method, based on prosodic clustering, for phoneme-level control of F0 and duration in a multispeaker text-to-speech setup.
no code implementations • 19 Nov 2021 • Alexandra Vioni, Myrsini Christidou, Nikolaos Ellinas, Georgios Vamvoukakis, Panos Kakoulidis, TaeHoon Kim, June Sig Sung, Hyoungmin Park, Aimilios Chalamandaris, Pirros Tsiakoulis
This paper presents a method for controlling the prosody at the phoneme level in an autoregressive attention-based text-to-speech system.
no code implementations • 17 Nov 2021 • Konstantinos Markopoulos, Nikolaos Ellinas, Alexandra Vioni, Myrsini Christidou, Panos Kakoulidis, Georgios Vamvoukakis, Georgia Maniati, June Sig Sung, Hyoungmin Park, Pirros Tsiakoulis, Aimilios Chalamandaris
This paper introduces a text-to-rapping/singing system that can be adapted to any speaker's voice.