no code implementations • NAACL 2022 • Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar, Anil Nelakanti, Vineet Gandhi
We present a method to control the emotional prosody of Text to Speech (TTS) systems by using phoneme-level intermediate features (pitch, energy, and duration) as levers.
no code implementations • 19 May 2023 • Neil Shah, Vishal Tambrahalli, Saiteja Kosgi, Niranjan Pedanekar, Vineet Gandhi
We present MParrotTTS, a unified multilingual, multi-speaker text-to-speech (TTS) synthesis model that can produce high-quality speech.
no code implementations • 1 Mar 2023 • Neil Shah, Saiteja Kosgi, Vishal Tambrahalli, Neha Sahipjohn, Niranjan Pedanekar, Vineet Gandhi
We present ParrotTTS, a modularized text-to-speech synthesis model leveraging disentangled self-supervised speech representations.
no code implementations • 7 Nov 2021 • Sarath Sivaprasad, Saiteja Kosgi, Vineet Gandhi
The proposed TTS system can generate speech from the text in any speaker's style, with fine control of emotion.
no code implementations • 15 Oct 2021 • Sarath Sivaprasad, Akshay Goindani, Vaibhav Garg, Ritam Basu, Saiteja Kosgi, Vineet Gandhi
We find that the presence of multiple domains incentivizes domain agnostic learning and is the primary reason for generalization in Tradition DG.