1 code implementation • 18 Dec 2024 • Steven Hogue, Chenxu Zhang, Yapeng Tian, Xiaohu Guo
Recent advances in co-speech gesture and talking head generation have been impressive, yet most methods focus on only one of the two tasks.
no code implementations • 11 Sep 2024 • Steven Hogue, Chenxu Zhang, Hamza Daruger, Yapeng Tian, Xiaohu Guo
Audio-driven talking video generation has advanced significantly, but existing methods often depend on video-to-video translation techniques and traditional generative networks such as GANs; they also typically generate talking heads and co-speech gestures separately, leading to less coherent outputs.
1 code implementation • 11 Jul 2024 • Suqi Song, Chenxu Zhang, Peng Zhang, Pengkun Li, Fenglong Song, Lei Zhang
Urban waterlogging poses a major risk to public safety and infrastructure.
1 code implementation • 7 May 2024 • Xiao Liu, Chenxu Zhang, Lei Zhang
In the field of deep learning, state space models are used to process sequence data in tasks such as time series analysis, natural language processing (NLP), and video understanding.
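To make the idea concrete, here is a minimal sketch of the discrete linear state space recurrence that such models build on, written in plain Python. The matrices and sizes are illustrative assumptions, not taken from any specific paper: x_{t+1} = A x_t + B u_t (state update), y_t = C x_t + D u_t (readout).

```python
# Minimal discrete linear state space model (SSM) recurrence.
# A hypothetical illustration of the general mechanism, not any
# paper's architecture.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def ssm_scan(A, B, C, D, inputs):
    """Run the linear SSM recurrence over a scalar input sequence."""
    x = [0.0] * len(A)  # hidden state starts at zero
    outputs = []
    for u in inputs:
        y = sum(c_j * x_j for c_j, x_j in zip(C, x)) + D * u  # readout
        outputs.append(y)
        x = [ax + b * u for ax, b in zip(matvec(A, x), B)]    # state update
    return outputs

# Example: a 2-state SSM acting as a simple low-pass filter on a
# constant input (illustrative parameter values).
A = [[0.9, 0.0], [0.1, 0.8]]
B = [1.0, 0.0]
C = [0.0, 1.0]
D = 0.0
out = ssm_scan(A, B, C, D, [1.0] * 5)
```

Because the recurrence is linear, it can be unrolled into a convolution for parallel training, which is the property recent deep SSM work exploits.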
1 code implementation • 9 Apr 2024 • Fan Yang, Jianfeng Zhang, Yichun Shi, Bowen Chen, Chenxu Zhang, Huichao Zhang, Xiaofeng Yang, Xiu Li, Jiashi Feng, Guosheng Lin
In detail, we first propose a novel multi-view conditioned diffusion model that extracts a 3D prior from the synthesized multi-view images to produce high-fidelity novel-view images, and then introduce a novel iterative-update strategy that applies this model as precise guidance, refining the coarse generated results through a fast optimization process.
no code implementations • 27 Mar 2024 • Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian
Experiments demonstrate that non-speech audio noises significantly impact ASD models, and our proposed approach improves ASD performance in noisy environments.
no code implementations • 27 Feb 2024 • XuanYi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng
We employ a method that transforms the generated videos into 3D models, leveraging the premise that the accuracy of 3D reconstruction is heavily contingent on the video quality.
no code implementations • 21 Dec 2023 • Chenxu Zhang, Chao Wang, Jianfeng Zhang, Hongyi Xu, Guoxian Song, You Xie, Linjie Luo, Yapeng Tian, Xiaohu Guo, Jiashi Feng
The generation of emotional talking faces from a single portrait image remains a significant challenge.
no code implementations • 29 Nov 2023 • Jianfeng Zhang, Xuanmeng Zhang, Huichao Zhang, Jun Hao Liew, Chenxu Zhang, Yi Yang, Jiashi Feng
We study the problem of creating high-fidelity and animatable 3D avatars from only textual descriptions.
3 code implementations • CVPR 2024 • Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou
Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion.
1 code implementation • ICCV 2021 • Chenxu Zhang, Yifan Zhao, Yifei Huang, Ming Zeng, Saifeng Ni, Madhukar Budagavi, Xiaohu Guo
In this paper, we propose a talking face generation method that takes an audio signal as input and a short target video clip as reference, and synthesizes a photo-realistic video of the target face with natural lip motions, head poses, and eye blinks that are in-sync with the input audio signal.