Search Results for author: Byoung Jin Choi

Found 8 papers, 2 papers with code

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

no code implementations3 Jan 2024 Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim

We also delve into the inference speed and prosody control capabilities of our approach, highlighting the potential of neural transducers in TTS frameworks.

SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

no code implementations30 Nov 2022 Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim

Zero-shot multi-speaker text-to-speech (ZSM-TTS) models aim to generate a speech sample with the voice characteristic of an unseen speaker.

Speech Synthesis

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

no code implementations12 Oct 2022 Byoung Jin Choi, Myeonghun Jeong, Minchan Kim, Sung Hwan Mun, Nam Soo Kim

Several recently proposed text-to-speech (TTS) models achieved to generate the speech samples with the human-level quality in the single-speaker and multi-speaker TTS scenarios with a set of pre-defined speakers.

Diff-TTS: A Denoising Diffusion Model for Text-to-Speech

1 code implementation3 Apr 2021 Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

Although neural text-to-speech (TTS) models have attracted a lot of attention and succeeded in generating human-like speech, there is still room for improvements to its naturalness and architectural efficiency.

Denoising Speech Synthesis

Expressive Text-to-Speech using Style Tag

no code implementations1 Apr 2021 Minchan Kim, Sung Jun Cheon, Byoung Jin Choi, Jong Jin Kim, Nam Soo Kim

In this work, we propose StyleTagging-TTS (ST-TTS), a novel expressive TTS model that utilizes a style tag written in natural language.

Language Modelling TAG

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

1 code implementation8 Jun 2020 Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real-time.

Speech Synthesis

Cannot find the paper you are looking for? You can Submit a new open access paper.