Search Results for author: Xiaodong Cun

Found 25 papers, 20 papers with code

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance

no code implementations · 1 Jun 2023 · Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong

Our method, dubbed Make-Your-Video, involves joint-conditional video generation using a Latent Diffusion Model that is pre-trained for still image synthesis and then promoted for video generation with the introduction of temporal modules.
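The abstract does not spell out what such a temporal module looks like. As a rough, hypothetical sketch (plain NumPy, made-up shapes, not the paper's implementation), a temporal self-attention layer lets each spatial token attend across frames while the pre-trained spatial layers are left untouched:

```python
import numpy as np

def temporal_self_attention(x):
    """Toy temporal attention: tokens at the same spatial location
    attend to each other across frames. x: (frames, tokens, dim)."""
    f, t, d = x.shape
    seq = x.transpose(1, 0, 2)                  # (tokens, frames, dim)
    scores = seq @ seq.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(-1, keepdims=True)     # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(-1, keepdims=True)         # softmax over frames
    out = attn @ seq
    return out.transpose(1, 0, 2)               # back to (frames, tokens, dim)
```

A sanity property of this layer: if every frame is identical, attention averages identical vectors and the input passes through unchanged, which is why such modules can be added without disturbing the still-image backbone.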

Inserting Anybody in Diffusion Models via Celeb Basis

1 code implementation · 1 Jun 2023 · Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng

Empowered by the proposed celeb basis, the new identity in our customized model showcases a better concept combination ability than previous personalization methods.

Explicit Visual Prompting for Universal Foreground Segmentations

1 code implementation · 29 May 2023 · Weihuang Liu, Xi Shen, Chi-Man Pun, Xiaodong Cun

We take inspiration from the widely-used pre-training and then prompt tuning protocols in NLP and propose a new visual prompting model, named Explicit Visual Prompting (EVP).

Object Detection +3

TaleCrafter: Interactive Story Visualization with Multiple Characters

1 code implementation · 29 May 2023 · Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang

Accurate story visualization requires several necessary elements, such as identity consistency across frames, alignment between plain text and visual content, and a reasonable layout of objects in images.

Story Visualization · Text-to-Image Generation

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos

1 code implementation · 3 Apr 2023 · Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen

Generating text-editable and pose-controllable character videos is in urgent demand for creating various digital humans.

Text-to-Image Generation · Text-to-Video Generation +1

Explicit Visual Prompting for Low-Level Structure Segmentations

1 code implementation · CVPR 2023 · Weihuang Liu, Xi Shen, Chi-Man Pun, Xiaodong Cun

Different from the previous visual prompting which is typically a dataset-level implicit embedding, our key insight is to enforce the tunable parameters focusing on the explicit visual content from each individual image, i.e., the features from frozen patch embeddings and the input's high-frequency components.

Visual Prompting
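The abstract mentions the input's high-frequency components as one source of the explicit prompt. As a hedged illustration of that idea only (a NumPy FFT sketch, not the authors' code), the high-frequency content of a single-channel image can be isolated by zeroing a centered low-frequency block of the spectrum:

```python
import numpy as np

def high_frequency_component(img, mask_ratio=0.25):
    """Keep only high frequencies of a 2-D image by zeroing a centered
    low-frequency square in the shifted FFT spectrum."""
    h, w = img.shape
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    ch, cw = h // 2, w // 2
    rh, rw = int(h * mask_ratio / 2), int(w * mask_ratio / 2)
    spectrum[ch - rh:ch + rh, cw - rw:cw + rw] = 0  # remove low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))
```

A flat image has all its energy at the DC term, so its high-frequency component is zero; edges and texture survive the filter.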

CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying

2 code implementations · 15 Mar 2023 · Weihuang Liu, Xiaodong Cun, Chi-Man Pun, Menghan Xia, Yong Zhang, Jue Wang

Thanks to the proposed structure, we only encode the high-resolution image at a relatively low resolution to capture a larger receptive field.

Image Inpainting
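The parameterized coordinate querying in the title can be sketched as follows, under stated assumptions: in the real method an encoder predicts per-image MLP weights from a low-resolution encoding, whereas here the weights are random placeholders and the MLP is a tiny two-layer toy. The point of the sketch is only that, once the MLP exists, it can be queried at any output resolution via a normalized coordinate grid:

```python
import numpy as np

def make_coord_grid(h, w):
    """Normalized pixel-center coordinates in [-1, 1], shape (h*w, 2)."""
    ys = (np.arange(h) + 0.5) / h * 2 - 1
    xs = (np.arange(w) + 0.5) / w * 2 - 1
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel()], axis=1)

def query_mlp(coords, weights):
    """Evaluate a tiny per-image MLP (whose weights would be predicted
    by an encoder in the real method) at arbitrary coordinates."""
    w1, b1, w2, b2 = weights
    h = np.maximum(coords @ w1 + b1, 0.0)       # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2))) # RGB in (0, 1)
```

Querying a denser grid with the same weights yields a higher-resolution output, which is the appeal of coordinate-based decoding for inpainting.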

DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

1 code implementation · CVPR 2023 · Youxin Pang, Yong Zhang, Weize Quan, Yanbo Fan, Xiaodong Cun, Ying Shan, Dong-Ming Yan

In this paper, we introduce a novel self-supervised disentanglement framework to decouple pose and expression without 3DMMs and paired data, which consists of a motion editing module, a pose generator, and an expression generator.

Disentanglement · Talking Face Generation +1

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

1 code implementation · CVPR 2023 · Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong

In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty.

3D Face Animation · Regression
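The "code query in a finite proxy space" amounts to the standard vector-quantization step: each continuous motion feature is replaced by its nearest entry in a learned codebook, so the output can only land on discrete, plausible motions. A minimal NumPy sketch of that lookup (the codebook and features here are toy stand-ins, not learned):

```python
import numpy as np

def codebook_query(features, codebook):
    """Replace each feature vector by its nearest codebook entry,
    the 'code query' step of a discrete motion prior.
    features: (n, d), codebook: (k, d)."""
    # squared L2 distance between every feature and every code
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx
```

Because the decoder only ever sees codebook entries, the cross-modal mapping from speech to motion reduces to picking indices, which is the uncertainty reduction the abstract refers to.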

3D GAN Inversion with Facial Symmetry Prior

no code implementations · CVPR 2023 · Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang

It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred to as 3D GAN inversion.

Image Reconstruction Neural Rendering
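In its simplest optimization-based form, GAN inversion is gradient descent on the latent code to reconstruct the target image. The sketch below is deliberately toy: a linear map stands in for the real 3D generator (so the gradient is closed-form and the problem convex), which is only meant to show the shape of the procedure, not the paper's method:

```python
import numpy as np

def invert_linear_gan(A, target, z0, lr=0.1, steps=200):
    """Toy GAN inversion: gradient descent on the latent z to minimize
    ||G(z) - target||^2, with a linear stand-in generator G(z) = A @ z."""
    z = z0.astype(float).copy()
    for _ in range(steps):
        grad = 2.0 * A.T @ (A @ z - target)  # gradient of the L2 loss
        z -= lr * grad
    return z
```

With a real 3D GAN the loss is non-convex, which is exactly why priors such as the facial symmetry constraint proposed in the paper help regularize the search.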

VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

1 code implementation · 27 Nov 2022 · Kun Cheng, Xiaodong Cun, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang

Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism.

Video Editing · Video Generation

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

1 code implementation · CVPR 2023 · Wenxuan Zhang, Xiaodong Cun, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang

We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation.

Talking Head Generation

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

1 code implementation · 8 Mar 2022 · Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang

Our framework elevates the resolution of the synthesized talking face to 1024×1024 for the first time, even though the training dataset has a lower resolution.

Facial Editing · Talking Face Generation +1

Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization

1 code implementation · 13 Sep 2021 · Jingtang Liang, Xiaodong Cun, Chi-Man Pun, Jue Wang

To this end, we propose a novel spatial-separated curve rendering network (S$^2$CRNet), the first designed for efficient, high-resolution image harmonization.

Image Harmonization · Image-to-Image Translation +2
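The curve-rendering idea can be illustrated with a hedged toy example: a low-order intensity curve, whose coefficients would be predicted per image by the network in the real method, is applied only inside the foreground mask. The quadratic form and coefficient names below are assumptions for the sketch, not the paper's parameterization:

```python
import numpy as np

def apply_curve(img, mask, coeffs):
    """Harmonize the masked (foreground) region with a quadratic
    intensity curve f(p) = c0 + c1*p + c2*p**2; background pixels
    pass through unchanged. img values in [0, 1], mask in {0, 1}."""
    c0, c1, c2 = coeffs
    curved = np.clip(c0 + c1 * img + c2 * img ** 2, 0.0, 1.0)
    return np.where(mask.astype(bool), curved, img)
```

Because the network only has to output a handful of curve coefficients rather than a full-resolution image, the curve can be rendered at any resolution almost for free, which is what makes this family of methods efficient.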

Split then Refine: Stacked Attention-guided ResUNets for Blind Single Image Visible Watermark Removal

1 code implementation · 13 Dec 2020 · Xiaodong Cun, Chi-Man Pun

Meanwhile, to test the robustness of watermarks, attacking techniques such as watermark removal have also attracted attention from the community.

Defocus Blur Detection via Depth Distillation

1 code implementation · ECCV 2020 · Xiaodong Cun, Chi-Man Pun

In detail, we learn the defocus blur from ground truth and the depth distilled from a well-trained depth estimation network at the same time.

Depth Estimation · Knowledge Distillation

Improving the Harmony of the Composite Image by Spatial-Separated Attention Module

1 code implementation · 15 Jul 2019 · Xiaodong Cun, Chi-Man Pun

Thus, we address the problem of Image Harmonization: Given a spliced image and the mask of the spliced region, we try to harmonize the "style" of the pasted region with the background (non-spliced region).

Image Harmonization

Depth Assisted Full Resolution Network for Single Image-based View Synthesis

no code implementations · 17 Nov 2017 · Xiaodong Cun, Feng Xu, Chi-Man Pun, Hao Gao

In this paper, we focus on a more challenging and ill-posed problem that is to synthesize novel viewpoints from one single input image.

Depth Estimation
