Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features

21 Nov 2019 · Siddharth Gururani, Kilol Gupta, Dhaval Shah, Zahra Shakeri, Jervis Pinto

This paper presents a simple yet effective method to achieve prosody transfer from a reference speech signal to synthesized speech. The main idea is to incorporate well-known acoustic correlates of prosody such as pitch and loudness contours of the reference speech into a modern neural text-to-speech (TTS) synthesizer such as Tacotron2 (TC2)...
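
To make the notion of "pitch and loudness contours of the reference speech" concrete, here is a minimal sketch of how such contours could be computed frame by frame. It is not the authors' implementation: the autocorrelation pitch estimator, the RMS loudness proxy, the frame sizes, and all function names (`frame_signal`, `autocorr_pitch`, `rms_loudness`, `prosody_contours`) are assumptions chosen for illustration, and the excerpt above does not say how the features are extracted or fed into Tacotron2.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (drops the ragged tail)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def rms_loudness(frame):
    """Root-mean-square energy of one frame, used here as a simple loudness proxy."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def autocorr_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate F0 in Hz from the autocorrelation peak within [fmin, fmax].

    Assumed estimator for illustration; real systems typically use a more
    robust pitch tracker.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # No positive autocorrelation peak: report 0.0 to mark the frame as unvoiced.
    return sample_rate / best_lag if best_lag else 0.0

def prosody_contours(samples, sample_rate, frame_len=400, hop=200):
    """Return (pitch contour in Hz, loudness contour as RMS), one value per frame."""
    frames = frame_signal(samples, frame_len, hop)
    pitch = [autocorr_pitch(f, sample_rate) for f in frames]
    loudness = [rms_loudness(f) for f in frames]
    return pitch, loudness

# Synthetic stand-in for reference speech: a 200 Hz tone, 0.25 s at 8 kHz.
SAMPLE_RATE = 8000
reference = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
             for n in range(2000)]
pitch_contour, loudness_contour = prosody_contours(reference, SAMPLE_RATE)
```

In a prosody-transfer setup, contours like these would be computed on the reference utterance and supplied to the synthesizer as extra conditioning; how that conditioning is wired in is left open here because the truncated abstract does not specify it.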
