no code implementations • 3 Jan 2024 • Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim
We also delve into the inference speed and prosody control capabilities of our approach, highlighting the potential of neural transducers in TTS frameworks.
no code implementations • 2 Jan 2024 • Myeonghun Jeong, Minchan Kim, Joun Yeop Lee, Nam Soo Kim
We present a fast and high-quality codec language model for parallel audio generation.
no code implementations • 6 Nov 2023 • Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Dongjune Lee, Nam Soo Kim
We introduce a text-to-speech(TTS) framework based on a neural transducer.
1 code implementation • 30 May 2023 • Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung
Second, competitive performance should be demonstrated compared to the fusion of automatic speaker verification (ASV) and countermeasure (CM) embeddings, which outperformed single embedding solutions by a large margin in the SASV2022 challenge.
no code implementations • 30 Nov 2022 • Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim
Zero-shot multi-speaker text-to-speech (ZSM-TTS) models aim to generate a speech sample with the voice characteristic of an unseen speaker.
no code implementations • 12 Oct 2022 • Byoung Jin Choi, Myeonghun Jeong, Minchan Kim, Sung Hwan Mun, Nam Soo Kim
Several recently proposed text-to-speech (TTS) models achieved to generate the speech samples with the human-level quality in the single-speaker and multi-speaker TTS scenarios with a set of pre-defined speakers.
no code implementations • 29 Mar 2022 • Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Sunghwan Ahn, Joun Yeop Lee, Nam Soo Kim
The experimental results verify the effectiveness of the proposed method in terms of naturalness, intelligibility, and speaker generalization.
1 code implementation • 3 Apr 2021 • Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim
Although neural text-to-speech (TTS) models have attracted a lot of attention and succeeded in generating human-like speech, there is still room for improvements to its naturalness and architectural efficiency.