1 code implementation • 21 May 2019 • Ohsung Kwon, Eunwoo Song, Jae-Min Kim, Hong-Goo Kang
In this paper, we propose a high-quality generative text-to-speech (TTS) system using an effective spectrum and excitation estimation method.
no code implementations • 30 Jun 2022 • Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim
In the proposed method, we first adopt a variational autoencoder whose posterior distribution is utilized to extract latent features representing acoustic similarity between the recorded and synthetic corpora.
no code implementations • 8 Feb 2024 • Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Sungroh Yoon, Kang Min Yoo
While recent work shows promising results in expanding the capabilities of large language models (LLMs) to directly understand and synthesize speech, an LLM-based strategy for modeling spoken dialogs remains elusive and calls for further investigation.
Automatic Speech Recognition (ASR)