no code implementations • 30 Nov 2022 • Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim
Zero-shot multi-speaker text-to-speech (ZSM-TTS) models aim to generate a speech sample with the voice characteristic of an unseen speaker.
no code implementations • 12 Oct 2022 • Byoung Jin Choi, Myeonghun Jeong, Minchan Kim, Sung Hwan Mun, Nam Soo Kim
Several recently proposed text-to-speech (TTS) models achieved to generate the speech samples with the human-level quality in the single-speaker and multi-speaker TTS scenarios with a set of pre-defined speakers.
no code implementations • 29 Mar 2022 • Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Sunghwan Ahn, Joun Yeop Lee, Nam Soo Kim
The experimental results verify the effectiveness of the proposed method in terms of naturalness, intelligibility, and speaker generalization.
1 code implementation • 3 Apr 2021 • Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim
Although neural text-to-speech (TTS) models have attracted a lot of attention and succeeded in generating human-like speech, there is still room for improvements to its naturalness and architectural efficiency.