no code implementations • 1 Jun 2023 • Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
Our method, dubbed Make-Your-Video, involves joint-conditional video generation using a Latent Diffusion Model that is pre-trained for still image synthesis and then promoted for video generation with the introduction of temporal modules.
1 code implementation • 1 Jun 2023 • Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng
Empowered by the proposed celeb basis, the new identity in our customized model showcases a better concept combination ability than previous personalization methods.
1 code implementation • 29 May 2023 • Weihuang Liu, Xi Shen, Chi-Man Pun, Xiaodong Cun
We take inspiration from the widely-used pre-training and then prompt tuning protocols in NLP and propose a new visual prompting model, named Explicit Visual Prompting (EVP).
1 code implementation • 29 May 2023 • Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang
Accurate story visualization requires several necessary elements, such as identity consistency across frames, the alignment between plain text and visual content, and a reasonable layout of objects in images.
1 code implementation • 3 Apr 2023 • Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen
Generating text-editable and pose-controllable character videos is in high demand for creating various digital humans.
1 code implementation • CVPR 2023 • Weihuang Liu, Xi Shen, Chi-Man Pun, Xiaodong Cun
Unlike previous visual prompting, which is typically a dataset-level implicit embedding, our key insight is to make the tunable parameters focus on the explicit visual content of each individual image, i.e., the features from frozen patch embeddings and the input's high-frequency components.
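As a rough illustration of what "the input's high-frequency components" can mean, the sketch below subtracts a box-blurred copy of a grayscale image from the image itself. This is an assumption made for illustration only: the listing does not say how the authors extract high frequencies, and the helper names `box_blur` and `high_frequency` are hypothetical.

```python
def box_blur(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window, truncated at the image borders."""
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx]
                        count += 1
            blurred[y][x] = total / count
    return blurred


def high_frequency(image, radius=1):
    """High-pass residual: the image minus its low-pass (blurred) version."""
    low = box_blur(image, radius)
    return [
        [image[y][x] - low[y][x] for x in range(len(image[0]))]
        for y in range(len(image))
    ]


# A single bright pixel on a dark background is almost pure high frequency.
spike = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
residual = high_frequency(spike)
```

A constant image yields an all-zero residual, while edges and isolated details survive, which is the kind of per-image explicit content the snippet above refers to.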
1 code implementation • 16 Mar 2023 • Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen
We also achieve better zero-shot shape-aware editing ability based on the text-to-video model.
2 code implementations • 15 Mar 2023 • Weihuang Liu, Xiaodong Cun, Chi-Man Pun, Menghan Xia, Yong Zhang, Jue Wang
Thanks to the proposed structure, we encode the high-resolution image only at a relatively low resolution to capture a larger receptive field.
1 code implementation • CVPR 2023 • Youxin Pang, Yong Zhang, Weize Quan, Yanbo Fan, Xiaodong Cun, Ying Shan, Dong-Ming Yan
In this paper, we introduce a novel self-supervised disentanglement framework to decouple pose and expression without 3DMMs and paired data, which consists of a motion editing module, a pose generator, and an expression generator.
1 code implementation • 15 Jan 2023 • Jianrong Zhang, Yangsong Zhang, Xiaodong Cun, Shaoli Huang, Yong Zhang, Hongwei Zhao, Hongtao Lu, Xi Shen
Additionally, we conduct analyses on HumanML3D and observe that the dataset size is a limitation of our approach.
Ranked #1 on Motion Synthesis on HumanML3D
1 code implementation • CVPR 2023 • Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty.
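The idea of a "code query" in a finite codebook can be sketched as a nearest-neighbor lookup: each continuous feature is replaced by the index of its closest codebook entry. This is a minimal sketch under assumptions, not the authors' implementation: the listing does not state the distance measure or how the codebook is learned, so squared Euclidean distance and a hand-written toy codebook are used here, and the names `query_codebook` and `quantize_sequence` are hypothetical.

```python
def query_codebook(feature, codebook):
    """Return (index, entry) of the codebook entry nearest to `feature`.

    Distance is squared Euclidean, an assumption made for this sketch.
    """
    def squared_distance(index):
        return sum((f - c) ** 2 for f, c in zip(feature, codebook[index]))

    best_index = min(range(len(codebook)), key=squared_distance)
    return best_index, codebook[best_index]


def quantize_sequence(features, codebook):
    """Map a sequence of feature vectors to a sequence of codebook indices."""
    return [query_codebook(feature, codebook)[0] for feature in features]


# Toy codebook with three 2-D entries (illustrative values only).
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
indices = quantize_sequence([[0.1, 0.0], [0.9, 1.2], [0.2, 1.9]], toy_codebook)
```

Because every output is drawn from a finite set of entries, the mapping from input to motion has fewer possible targets, which is one way to read the snippet's claim about reducing cross-modal mapping uncertainty.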
no code implementations • CVPR 2023 • Jianrong Zhang, Yangsong Zhang, Xiaodong Cun, Yong Zhang, Hongwei Zhao, Hongtao Lu, Xi Shen, Ying Shan
Additionally, we conduct analyses on HumanML3D and observe that the dataset size is a limitation of our approach.
no code implementations • CVPR 2023 • Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang
It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred to as 3D GAN inversion.
1 code implementation • 30 Nov 2022 • Xuhang Chen, Xiaodong Cun, Chi-Man Pun, Shuqiang Wang
Shadow removal improves the visual quality and legibility of digital copies of documents.
1 code implementation • 27 Nov 2022 • Kun Cheng, Xiaodong Cun, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang
Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism.
1 code implementation • CVPR 2023 • Wenxuan Zhang, Xiaodong Cun, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang
We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face renderer for talking head generation.
no code implementations • 21 Mar 2022 • Xiaodong Cun, Zhendong Wang, Chi-Man Pun, Jianzhuang Liu, Wengang Zhou, Xu Jia, Houqiang Li
Color constancy aims to restore the constant colors of a scene under different illuminants.
1 code implementation • 8 Mar 2022 • Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
Our framework elevates the resolution of the synthesized talking face to 1024$\times$1024 for the first time, even though the training dataset has a lower resolution.
1 code implementation • 13 Sep 2021 • Jingtang Liang, Xiaodong Cun, Chi-Man Pun, Jue Wang
To this end, we propose a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient and high-resolution image harmonization for the first time.
Ranked #7 on Image Harmonization on iHarmony4
4 code implementations • CVPR 2022 • Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, Houqiang Li
Powered by these two designs, Uformer enjoys a high capability for capturing both local and global dependencies for image restoration.
Ranked #1 on Deblurring on RealBlur-R (trained on GoPro)
1 code implementation • 13 Dec 2020 • Xiaodong Cun, Chi-Man Pun
Meanwhile, to increase the robustness of watermarks, attacking techniques such as watermark removal have also attracted attention from the community.
1 code implementation • ECCV 2020 • Xiaodong Cun, Chi-Man Pun
In detail, we learn the defocus blur from ground truth and the depth distilled from a well-trained depth estimation network at the same time.
4 code implementations • 20 Nov 2019 • Xiaodong Cun, Chi-Man Pun, Cheng Shi
With the help of novel masks or scenes, we enhance the current datasets using synthesized shadow images.
Ranked #2 on Shadow Removal on ISTD
1 code implementation • 15 Jul 2019 • Xiaodong Cun, Chi-Man Pun
Thus, we address the problem of Image Harmonization: Given a spliced image and the mask of the spliced region, we try to harmonize the "style" of the pasted region with the background (non-spliced region).
Ranked #4 on Image Harmonization on HAdobe5k (1024$\times$1024)
no code implementations • 17 Nov 2017 • Xiaodong Cun, Feng Xu, Chi-Man Pun, Hao Gao
In this paper, we focus on a more challenging and ill-posed problem: synthesizing novel viewpoints from a single input image.