no code implementations • 3 Apr 2024 • Jaehyeon Kim, Keon Lee, Seungjun Chung, Jaewoong Cho
With the emergence of neural audio codecs, which encode multiple streams of discrete tokens from audio, large language models have recently gained attention as a promising approach for zero-shot Text-to-Speech (TTS) synthesis.
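Multi-stream discrete tokenization of the kind described above is commonly achieved with residual vector quantization (RVQ): each codebook quantizes the residual left by the previous one, yielding one token stream per codebook. The sketch below is illustrative only; the codebook count, size, and dimensions are made up and are not the codec used in the paper.

```python
import numpy as np

# Illustrative residual vector quantization (RVQ) sketch: one way a
# neural audio codec can turn a sequence of embedding frames into
# multiple parallel streams of discrete tokens. All sizes are arbitrary.
rng = np.random.default_rng(0)
num_streams, codebook_size, dim = 4, 256, 16
codebooks = rng.normal(size=(num_streams, codebook_size, dim))

def rvq_encode(frames, codebooks):
    """Quantize each frame into one token per stream; each later
    stream quantizes the residual left by the earlier streams."""
    residual = frames.copy()
    tokens = []
    for cb in codebooks:
        # Nearest codeword in this stream's codebook for every frame.
        dists = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        idx = dists.argmin(axis=1)
        tokens.append(idx)
        residual = residual - cb[idx]  # hand the residual to the next stream
    return np.stack(tokens)  # shape: (num_streams, num_frames)

frames = rng.normal(size=(10, dim))  # 10 embedding frames
tokens = rvq_encode(frames, codebooks)
print(tokens.shape)  # one token per frame per stream: (4, 10)
```

A language model can then predict these token streams autoregressively and a codec decoder can reconstruct audio from them, which is the premise of the zero-shot TTS approach mentioned above.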
1 code implementation • 12 Jul 2023 • Jaewoong Cho, Kartik Sreenivasan, Keon Lee, Kyunghoo Mun, Soheun Yi, Jeong-Gwan Lee, Anna Lee, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee
Contrastive learning has gained significant attention as a method for self-supervised learning.
no code implementations • 26 Oct 2022 • Kyumin Park, Keon Lee, Daeyoung Kim, Dongyeop Kang
We present a novel speech dataset, RedPen, with human annotations on unnatural speech regions and their corresponding reasons.
1 code implementation • 3 Jul 2022 • Keon Lee, Kyumin Park, Daeyoung Kim
The majority of current Text-to-Speech (TTS) datasets, which are collections of individual utterances, contain few conversational aspects.
1 code implementation • 17 Mar 2021 • Keon Lee, Kyumin Park, Daeyoung Kim
Previous work on neural text-to-speech (TTS) has addressed limited training and inference speed, robustness under difficult synthesis conditions, expressiveness, and controllability.