2 code implementations • 24 Feb 2023 • Junhyeok Lee, Wonbin Jung, Hyunjae Cho, Jaeyeon Kim, Jaehwan Kim
Previous pitch-controllable text-to-speech (TTS) models rely on directly modeling fundamental frequency, leading to low variance in synthesized speech.
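The snippet above refers to directly modeling fundamental frequency (F0). As a hedged illustration of what F0 means in this context — not the paper's method — here is a minimal autocorrelation-based pitch estimator; `estimate_f0` and its parameters are hypothetical names for this sketch, and real pitch trackers (e.g. YIN) are far more robust:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=50.0, fmax=500.0):
    # Estimate the fundamental frequency of one frame via autocorrelation.
    # Illustrative toy only; production TTS pipelines use robust trackers.
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)          # shortest plausible pitch period
    lag_max = int(sr / fmin)          # longest plausible pitch period
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return sr / lag                   # period in samples -> frequency in Hz

sr = 16000
t = np.arange(sr // 10) / sr                  # 100 ms frame
frame = np.sin(2 * np.pi * 220.0 * t)         # 220 Hz sine
print(estimate_f0(frame, sr))                 # close to 220 Hz
```

A model that predicts only a single F0 contour like this one per utterance tends to produce the low-variance prosody the abstract criticizes.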
2 code implementations • 8 Nov 2022 • Junhyeok Lee, Seungu Han, Hyunjae Cho, Wonbin Jung
Previous generative adversarial network (GAN)-based neural vocoders are trained to reconstruct the exact ground truth waveform from the paired mel-spectrogram and do not consider the one-to-many relationship of speech synthesis.
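The one-to-many point above can be made concrete with a toy example (mine, not the paper's): two waveforms that differ only in phase have essentially the same magnitude spectrum — and so the same mel-spectrogram — yet an exact-waveform L1 loss treats one as a large error relative to the other:

```python
import numpy as np

# Two waveforms differing only in phase: spectrally equivalent,
# yet far apart sample-by-sample (hypothetical toy signals).
sr = 16000
t = np.arange(sr // 10) / sr                       # 100 ms, 22 full cycles
ref = np.sin(2 * np.pi * 220.0 * t)
shifted = np.sin(2 * np.pi * 220.0 * t + np.pi / 2)

# Magnitude spectra are (numerically) identical ...
mag_ref = np.abs(np.fft.rfft(ref))
mag_shift = np.abs(np.fft.rfft(shifted))
print(np.max(np.abs(mag_ref - mag_shift)))         # tiny

# ... but an exact-waveform L1 loss sees a large mismatch,
# although both signals are equally valid for this spectrum.
print(np.mean(np.abs(ref - shifted)))              # roughly 0.9
```

This is why training a vocoder to reconstruct one exact ground-truth waveform per mel-spectrogram ignores the many valid waveforms that share that spectrogram.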
no code implementations • 24 Jun 2022 • Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo
Because obtaining a multilingual corpus for a given speaker is difficult, training a multilingual TTS model with monolingual corpora is unavoidable.
no code implementations • CVPR 2022 • Hyoung-Kyu Song, Sang Hoon Woo, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim
In this work, we propose a joint system combining a talking face generation system with a text-to-speech system that can generate multilingual talking face videos from only the text input.