Search Results for author: Chenxu Zhang

Found 11 papers, 6 papers with code

Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters

1 code implementation • 18 Dec 2024 • Steven Hogue, Chenxu Zhang, Yapeng Tian, Xiaohu Guo

Recent advances in co-speech gesture and talking head generation have been impressive, yet most methods focus on only one of the two tasks.

Talking Face Generation, Talking Head Generation

DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures

no code implementations • 11 Sep 2024 • Steven Hogue, Chenxu Zhang, Hamza Daruger, Yapeng Tian, Xiaohu Guo

Audio-driven talking video generation has advanced significantly, but existing methods often depend on video-to-video translation techniques and traditional generative networks like GANs, and they typically generate talking heads and co-speech gestures separately, leading to less coherent outputs.

Diversity, Talking Head Generation, +1

Vision Mamba: A Comprehensive Survey and Taxonomy

1 code implementation • 7 May 2024 • Xiao Liu, Chenxu Zhang, Lei Zhang

In the field of deep learning, state space models are used to process sequence data, such as time series analysis, natural language processing (NLP) and video understanding.

Mamba, Medical Image Analysis, +4

Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion

1 code implementation • 9 Apr 2024 • Fan Yang, Jianfeng Zhang, Yichun Shi, Bowen Chen, Chenxu Zhang, Huichao Zhang, Xiaofeng Yang, Xiu Li, Jiashi Feng, Guosheng Lin

In detail, we first propose a novel multi-view conditioned diffusion model that extracts 3D priors from the synthesized multi-view images to synthesize high-fidelity novel-view images, and then introduce a novel iterative-update strategy that uses it to provide precise guidance for refining the coarse generated results through a fast optimization process.

3D Generation

Robust Active Speaker Detection in Noisy Environments

no code implementations • 27 Mar 2024 • Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian

Experiments demonstrate that non-speech audio noises significantly impact ASD models, and our proposed approach improves ASD performance in noisy environments.

Active Speaker Detection, Speech Separation

Sora Generates Videos with Stunning Geometrical Consistency

no code implementations • 27 Feb 2024 • XuanYi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng

We employ a method that transforms the generated videos into 3D models, leveraging the premise that the accuracy of 3D reconstruction is heavily contingent on the video quality.

3D Reconstruction, Video Generation

AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text

no code implementations • 29 Nov 2023 • Jianfeng Zhang, Xuanmeng Zhang, Huichao Zhang, Jun Hao Liew, Chenxu Zhang, Yi Yang, Jiashi Feng

We study the problem of creating high-fidelity and animatable 3D avatars from only textual descriptions.

FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning

1 code implementation • ICCV 2021 • Chenxu Zhang, Yifan Zhao, Yifei HUANG, Ming Zeng, Saifeng Ni, Madhukar Budagavi, Xiaohu Guo

In this paper, we propose a talking face generation method that takes an audio signal as input and a short target video clip as reference, and synthesizes a photo-realistic video of the target face with natural lip motions, head poses, and eye blinks that are in sync with the input audio signal.

3D Face Animation, Attribute, +2
