1 code implementation • 3 Sep 2024 • Wangbo Yu, Jinbo Xing, Li Yuan, WenBo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian
Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by a point-based representation to generate high-quality video frames with precise camera pose control.
no code implementations • 28 May 2024 • Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong
We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation, paving the way for generative interpolation.
1 code implementation • 5 Dec 2023 • Yue Ma, Xiaodong Cun, Sen Liang, Jinbo Xing, Yingqing He, Chenyang Qi, Siran Chen, Qifeng Chen
Despite its simplicity, our method is the first to demonstrate video property editing with a pre-trained text-to-image model.
2 code implementations • 1 Dec 2023 • Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Yibo Wang, Xintao Wang, Yujiu Yang, Ying Shan
To address these challenges, we introduce StyleCrafter, a generic method that enhances pre-trained T2V models with a style control adapter, enabling video generation in any style by providing a reference image.
3 code implementations • 30 Oct 2023 • Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan
The I2V model is designed to produce videos that strictly adhere to the provided reference image, preserving its content, structure, and style.
Ranked #3 on Text-to-Video Generation on the EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)
1 code implementation • 18 Oct 2023 • Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan
Animating a still image offers an engaging visual experience.
1 code implementation • 13 Jul 2023 • Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen
For the first module, we leverage an off-the-shelf video retrieval system and extract video depths as motion structure.
no code implementations • 2 Jun 2023 • Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
In this paper, we present ColorDiffuser, an adaptation of a pre-trained text-to-image latent diffusion model for video colorization.
no code implementations • 1 Jun 2023 • Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
Our method, dubbed Make-Your-Video, performs joint-conditional video generation using a Latent Diffusion Model that is pre-trained for still-image synthesis and then adapted for video generation through the introduction of temporal modules.
2 code implementations • NeurIPS 2023 • Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
1 code implementation • 21 Apr 2023 • Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong
Our key idea is to exploit the color prior knowledge in the pre-trained T2I diffusion model for realistic and diverse colorization.
1 code implementation • CVPR 2023 • Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty.
Ranked #4 on 3D Face Animation on BEAT2
1 code implementation • CVPR 2023 • Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia
We propose a ray registration process based on the stylized reference view to obtain pseudo-ray supervision in novel views.
no code implementations • 29 Jan 2022 • Jinbo Xing, WenBo Hu, Tien-Tsin Wong
In this paper, we propose a scale-Arbitrary Invertible image Downscaling Network (AIDN) that natively downscales HR images with arbitrary scale factors.
1 code implementation • 15 Jan 2021 • Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng
Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs.