no code implementations • 31 Mar 2024 • Taekyung Ki, Dongchan Min, Gyeongsu Chae
In this paper, we present Export3D, a one-shot 3D-aware portrait animation method that is able to control the facial expression and camera view of a given portrait image.
no code implementations • 30 May 2023 • Doyeon Kim, Eunji Ko, Hyunsu Kim, Yunji Kim, Junho Kim, Dongchan Min, Junmo Kim, Sung Ju Hwang
Portrait stylization, which translates a real human face image into an artistically stylized image, has attracted considerable interest, and many prior works have achieved impressive quality in recent years.
no code implementations • ICCV 2023 • Taekyung Ki, Dongchan Min
In this paper, we present StyleLipSync, a style-based personalized lip-sync video generative model that can generate identity-agnostic, lip-synchronized video from arbitrary audio.
no code implementations • 17 Nov 2022 • Minki Kang, Dongchan Min, Sung Ju Hwang
There has been significant progress in Text-To-Speech (TTS) synthesis technology in recent years, thanks to advances in neural generative modeling.
no code implementations • 23 Aug 2022 • Dongchan Min, Minyoung Song, Eunji Ko, Sung Ju Hwang
We propose StyleTalker, a novel audio-driven talking head generation model that can synthesize a video of a talking person from a single reference image with accurately audio-synced lip shapes, realistic head poses, and eye blinks.
no code implementations • 20 Jun 2022 • Hyunsu Rhee, Dongchan Min, Sunil Hwang, Bruno Andreis, Sung Ju Hwang
Real-time video segmentation is a crucial task for many real-world applications such as autonomous driving and robot control.
2 code implementations • 6 Jun 2021 • Dongchan Min, Dong Bok Lee, Eunho Yang, Sung Ju Hwang
In this work, we propose StyleSpeech, a new TTS model which not only synthesizes high-quality speech but also effectively adapts to new speakers.
1 code implementation • ICLR 2021 • Dong Bok Lee, Dongchan Min, Seanie Lee, Sung Ju Hwang
Then, the learned model can be used for downstream few-shot classification tasks, where we obtain task-specific parameters by performing semi-supervised EM on the latent representations of the support and query sets, and predict labels of the query set by computing aggregated posteriors.
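To illustrate the semi-supervised EM step described above, here is a minimal sketch, not the authors' implementation: it assumes latent representations have already been produced by a pretrained encoder (replaced below with random placeholders), fits one isotropic Gaussian per class where support latents keep their hard labels and query latents receive soft responsibilities, and predicts query labels from the resulting posteriors. The function name `semi_supervised_em` and the isotropic-Gaussian-per-class modeling choice are assumptions made for the example, not the paper's exact formulation.

```python
# Minimal sketch (not the authors' code): semi-supervised EM over latent
# embeddings for an N-way few-shot episode. Latents are random placeholders
# standing in for encoder outputs.
import numpy as np

def semi_supervised_em(z_support, y_support, z_query, n_classes, n_iters=10, eps=1e-6):
    """Fit one isotropic Gaussian per class with EM.

    Support latents keep hard labels; query latents contribute through soft
    responsibilities. Returns soft posteriors over classes for the query set.
    """
    d = z_support.shape[1]
    # Initialize class means from the labeled support prototypes.
    means = np.stack([z_support[y_support == c].mean(axis=0) for c in range(n_classes)])
    var = np.ones(n_classes)  # one isotropic variance per class

    # Hard (one-hot) responsibilities for the support set, kept fixed.
    r_support = np.eye(n_classes)[y_support]

    for _ in range(n_iters):
        # E-step: soft responsibilities for the unlabeled query latents.
        sq_dist = ((z_query[:, None, :] - means[None, :, :]) ** 2).sum(-1)  # (Q, C)
        log_p = -0.5 * sq_dist / var[None, :] - 0.5 * d * np.log(var[None, :] + eps)
        log_p -= log_p.max(axis=1, keepdims=True)
        r_query = np.exp(log_p)
        r_query /= r_query.sum(axis=1, keepdims=True)

        # M-step: re-estimate each class mean/variance from support + query.
        r_all = np.concatenate([r_support, r_query], axis=0)  # (S+Q, C)
        z_all = np.concatenate([z_support, z_query], axis=0)  # (S+Q, d)
        n_c = r_all.sum(axis=0) + eps
        means = (r_all.T @ z_all) / n_c[:, None]
        diff = ((z_all[:, None, :] - means[None, :, :]) ** 2).sum(-1)  # (S+Q, C)
        var = (r_all * diff).sum(axis=0) / (d * n_c) + eps

    return r_query  # posteriors over classes; argmax gives predicted labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_way, k_shot, q_per_class, dim = 5, 5, 15, 64
    y_support = np.repeat(np.arange(n_way), k_shot)
    z_support = rng.normal(size=(n_way * k_shot, dim)) + y_support[:, None]
    y_query = np.repeat(np.arange(n_way), q_per_class)
    z_query = rng.normal(size=(n_way * q_per_class, dim)) + y_query[:, None]

    posteriors = semi_supervised_em(z_support, y_support, z_query, n_way)
    print(f"query accuracy: {(posteriors.argmax(axis=1) == y_query).mean():.2f}")
```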