Search Results for author: Heeseung Kim

Found 8 papers, 2 papers with code

Unified Speech-Text Pretraining for Spoken Dialog Modeling

no code implementations 8 Feb 2024 Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Sungroh Yoon, Kang Min Yoo

While recent work shows promising results in expanding the capabilities of large language models (LLMs) to directly understand and synthesize speech, an LLM-based strategy for modeling spoken dialogs remains elusive and calls for further investigation.

Automatic Speech Recognition (ASR)

Edit-A-Video: Single Video Editing with Object-Aware Consistency

no code implementations 14 Mar 2023 Chaehun Shin, Heeseung Kim, Che Hyun Lee, Sang-gil Lee, Sungroh Yoon

Although text-to-video (TTV) models have recently achieved remarkable success, few approaches have extended TTV to video editing.

Video Editing

Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data

no code implementations 30 May 2022 Sungwon Kim, Heeseung Kim, Sungroh Yoon

We train the speaker-conditional diffusion model on large-scale untranscribed datasets for a classifier-free guidance method and further fine-tune the diffusion model on the reference speech of the target speaker for adaptation, which only takes 40 seconds.

Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

no code implementations 23 Nov 2021 Heeseung Kim, Sungwon Kim, Sungroh Yoon

For TTS synthesis, we guide the generative process of the diffusion model with a phoneme classifier trained on a large-scale speech recognition dataset.

Speech Recognition

Guided-TTS: Text-to-Speech with Untranscribed Speech

no code implementations 29 Sep 2021 Heeseung Kim, Sungwon Kim, Sungroh Yoon

By modeling the unconditional distribution for speech, our model can utilize the untranscribed data for training.

Speech Synthesis Text-To-Speech Synthesis

Stein Latent Optimization for Generative Adversarial Networks

1 code implementation ICLR 2022 Uiwon Hwang, Heeseung Kim, Dahuin Jung, Hyemi Jang, Hyungyu Lee, Sungroh Yoon

Generative adversarial networks (GANs) with clustered latent spaces can perform conditional generation in a completely unsupervised manner.

Attribute
