2 code implementations • 27 Jun 2022 • Taejun Bak, Junmo Lee, Hanbin Bae, Jinhyeok Yang, Jae-Sung Bae, Young-Sun Joo
Therefore, in this paper, we investigate the relationship between these artifacts and GAN-based vocoders and propose a GAN-based vocoder, called Avocodo, that allows the synthesis of high-fidelity speech with reduced artifacts.
no code implementations • 4 Mar 2021 • Hanbin Bae, Jae-Sung Bae, Young-Sun Joo, Young-Ik Kim, Hoon-Young Cho
Second, the GST-TTS model with an auxiliary quality classifier is trained with the filtered speech and a small amount of clean speech.
no code implementations • 29 Jun 2021 • Jae-Sung Bae, Tae-Jun Bak, Young-Sun Joo, Hoon-Young Cho
Therefore, to improve the modeling performance of the TNA-TTS model, we propose a hierarchical Transformer structure-based text encoder and audio decoder that are designed to accommodate the characteristics of each module.
no code implementations • 8 Apr 2022 • Jae-Sung Bae, Jinhyeok Yang, Tae-Jun Bak, Young-Sun Joo
This paper proposes a hierarchical and multi-scale variational autoencoder-based non-autoregressive text-to-speech model (HiMuV-TTS) to generate natural speech with diverse speaking styles.
no code implementations • 12 Apr 2022 • Hanbin Bae, Young-Sun Joo
To address this issue, we propose two algorithms to improve the robustness of FastPitch.