no code implementations • 2 Aug 2024 • Dong Huo, Zixin Guo, Xinxin Zuo, Zhihao Shi, Juwei Lu, Peng Dai, Songcen Xu, Li Cheng, Yee-Hong Yang
For view-consistent sampling, we first maintain a texture map in RGB space that is parameterized by the denoising step and updated after each sampling step of the diffusion model to progressively reduce the view discrepancy.
no code implementations • 5 Jul 2024 • Yuxuan Mu, Xinxin Zuo, Chuan Guo, Yilin Wang, Juwei Lu, Xiaofeng Wu, Songcen Xu, Peng Dai, Youliang Yan, Li Cheng
We present GSD, a diffusion model approach based on the Gaussian Splatting (GS) representation for 3D object reconstruction from a single view.
no code implementations • 24 Jan 2024 • Chuan Guo, Yuxuan Mu, Xinxin Zuo, Peng Dai, Youliang Yan, Juwei Lu, Li Cheng
Building upon this, we present a novel generative model that produces diverse stylization results of a single motion (latent) code.
1 code implementation • 7 Dec 2023 • Tiantian Wang, Xinxin Zuo, Fangzhou Mu, Jian Wang, Ming-Hsuan Yang
To overcome these limitations, we leverage Neural Radiance Fields (NeRFs) to represent videos, conducting stylization in the rendered feature space.
1 code implementation • ICCV 2023 • Kehong Gong, Dongze Lian, Heng Chang, Chuan Guo, Zihang Jiang, Xinxin Zuo, Michael Bi Mi, Xinchao Wang
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Ranked #2 on Motion Synthesis on AIST++
1 code implementation • 16 Mar 2023 • Shihao Zou, Yuxuan Mu, Xinxin Zuo, Sen Wang, Li Cheng
Motivated by the above-mentioned issues, we present in this paper a dedicated end-to-end sparse deep learning approach for event-based pose tracking: 1) to our knowledge, this is the first time that 3D human pose tracking is obtained from events only, thus eliminating the need to access any frame-based images as part of the input; 2) our approach is based entirely upon the framework of Spiking Neural Networks (SNNs), and consists of a Spike-Element-Wise (SEW) ResNet and a novel Spiking Spatiotemporal Transformer; 3) a large-scale synthetic dataset, named SynEventHPD, is constructed that features a broad and diverse set of annotated 3D human motions, as well as longer hours of event stream data.
1 code implementation • 4 Jul 2022 • Chuan Guo, Xinxin Zuo, Sen Wang, Li Cheng
Our approach is flexible and can be used for both text2motion and motion2text tasks.
Ranked #3 on Motion Captioning on HumanML3D
1 code implementation • CVPR 2022 • Chuan Guo, Shihao Zou, Xinxin Zuo, Sen Wang, Wei Ji, Xingyu Li, Li Cheng
Automated generation of 3D human motions from text is a challenging problem.
Ranked #3 on Motion Synthesis on Inter-X
no code implementations • 26 Nov 2021 • Ji Yang, Youdong Ma, Xinxin Zuo, Sen Wang, Minglun Gong, Li Cheng
This paper jointly tackles the highly correlated tasks of estimating 3D human body poses and predicting future 3D motions from RGB image sequences.
no code implementations • 12 Nov 2021 • Chuan Guo, Xinxin Zuo, Sen Wang, Xinshuang Liu, Shihao Zou, Minglun Gong, Li Cheng
Action2motion stochastically generates plausible 3D pose sequences of a prescribed action category, which are processed and rendered by motion2video to form 2D videos.
1 code implementation • ICCV 2021 • Shihao Zou, Chuan Guo, Xinxin Zuo, Sen Wang, Pengyu Wang, Xiaoqin Hu, Shoushun Chen, Minglun Gong, Li Cheng
The event camera is an emerging imaging sensor that captures the dynamics of moving objects as events, which motivates our work in estimating 3D human pose and shape from the event signals.
1 code implementation • 15 Aug 2021 • Shihao Zou, Xinxin Zuo, Sen Wang, Yiming Qian, Chuan Guo, Li Cheng
This paper focuses on a new problem of estimating human pose and shape from single polarization images.
no code implementations • 6 Aug 2021 • Hao Zhu, Xinxin Zuo, Haotian Yang, Sen Wang, Xun Cao, Ruigang Yang
In this paper, we propose a novel learning-based framework that combines the robustness of the parametric model with the flexibility of free-form 3D deformation.
no code implementations • 5 Aug 2021 • Ji Yang, Xinxin Zuo, Sen Wang, Zhenbo Yu, Xingyu Li, Bingbing Ni, Minglun Gong, Li Cheng
A dataset of generic 3D objects with ground-truth annotated skeletons is collected.
1 code implementation • 15 Jul 2021 • Xinxin Zuo, Sen Wang, Qiang Sun, Minglun Gong, Li Cheng
However, the Chamfer distance is quite sensitive to noise and outliers, and thus can be unreliable for assigning correspondences.
no code implementations • CVPR 2021 • Jin Fang, Xinxin Zuo, Dingfu Zhou, Shengze Jin, Sen Wang, Liangjun Zhang
Finally, we verify the proposed framework on the public KITTI dataset with different 3D object detectors.
1 code implementation • 30 Jul 2020 • Chuan Guo, Xinxin Zuo, Sen Wang, Shihao Zou, Qingyao Sun, Annan Deng, Minglun Gong, Li Cheng
Action recognition is a relatively established task, where given an input sequence of human motion, the goal is to predict its action category.
no code implementations • ECCV 2020 • Shihao Zou, Xinxin Zuo, Yiming Qian, Sen Wang, Chi Xu, Minglun Gong, Li Cheng
Inspired by the recent advances in human shape estimation from single color images, in this paper, we attempt to estimate human body shapes by leveraging the geometric cues from single polarization images.
1 code implementation • 17 Jul 2020 • Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang
In this paper, we propose a novel approach to convert given speech audio to a photo-realistic speaking video of a specific person, where the output video has synchronized, realistic, and expressive rich body dynamics.
no code implementations • 5 Jun 2020 • Xinxin Zuo, Sen Wang, Jiangbin Zheng, Weiwei Yu, Minglun Gong, Ruigang Yang, Li Cheng
First, based on a generative human template, an initial pairwise alignment is performed for every two frames having sufficient overlap. It is followed by a global non-rigid registration procedure, in which partial results from RGBD frames are collected into a unified 3D shape, under the guidance of correspondences from the pairwise alignment. Finally, the texture map of the reconstructed human model is optimized to deliver a clear and spatially consistent texture.
no code implementations • 30 Apr 2020 • Shihao Zou, Xinxin Zuo, Yiming Qian, Sen Wang, Chuan Guo, Chi Xu, Minglun Gong, Li Cheng
Polarization images are known to capture polarized reflected light that preserves rich geometric cues of an object, which has motivated their recent application in reconstructing detailed surface normals of the objects of interest.
1 code implementation • CVPR 2019 • Hao Zhu, Xinxin Zuo, Sen Wang, Xun Cao, Ruigang Yang
This paper presents a novel framework to recover detailed human body shapes from a single image.
no code implementations • ICCV 2017 • Xinxin Zuo, Sen Wang, Jiangbin Zheng, Ruigang Yang
In this paper, we present a novel approach for depth map enhancement from an RGB-D video sequence.
no code implementations • ICCV 2015 • Xinxin Zuo, Chao Du, Sen Wang, Jiangbin Zheng, Ruigang Yang
We discovered that these internal contours, which are results of convex parts on an object's surface, can lead to a tighter fit than the original visual hull.