In this work, we propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.
Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets.
Because single-view RGB images captured in the wild are hard to calibrate, existing 3D human mesh reconstruction (3DHMR) methods either assume a constant large focal length or estimate one from the background context. Neither strategy addresses the distortion of the torso, limbs, hands, or face caused by perspective camera projection when the camera is close to the human body.
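The distortion above follows directly from the pinhole projection equation. A minimal sketch, with purely illustrative numbers (the focal length, part sizes, and camera distances are assumptions, not values from any specific method), shows how two equally sized body parts at slightly different depths project very differently when the camera is close:

```python
# Hedged sketch: why a close perspective camera distorts body parts.
# All numeric values are illustrative assumptions.

def project(f, x, z):
    """Pinhole projection: image-plane size of an object of extent x at depth z."""
    return f * x / z

# Two body parts of equal real extent (0.1 m), e.g. a hand and the torso,
# separated by 0.3 m in depth (hand extended toward the camera).
part_size = 0.1
depth_gap = 0.3
f = 1000.0  # focal length in pixels (assumed)

for cam_dist in (0.6, 5.0):  # close-up vs. distant camera
    hand = project(f, part_size, cam_dist)
    torso = project(f, part_size, cam_dist + depth_gap)
    print(f"camera at {cam_dist} m: hand/torso projected size ratio = {hand / torso:.2f}")
```

At 0.6 m the hand appears 1.5x larger than the torso despite equal physical size, while at 5 m the ratio is nearly 1, which is why a single fixed focal length cannot compensate for close-range shots.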
To this end, we propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
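One common way such a feature bank is assembled, sketched below under stated assumptions: the function names, shapes, and the concatenation scheme are illustrative, not the paper's actual implementation. The pixel-aligned component is typically gathered by bilinearly sampling a 2D feature map at the query point's projected image location:

```python
# Hedged sketch: gathering a hierarchical feature bank for one 3D query point,
# combining a global vector, a point-level feature, and a pixel-aligned feature.
# Names, shapes, and values are illustrative assumptions.

def bilinear_sample(fmap, u, v):
    """Bilinearly sample a 2D scalar feature map (list of rows) at coords (u, v)."""
    h, w = len(fmap), len(fmap[0])
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    return (fmap[v0][u0] * (1 - du) * (1 - dv) + fmap[v0][u1] * du * (1 - dv)
            + fmap[v1][u0] * (1 - du) * dv + fmap[v1][u1] * du * dv)

# Toy 1-channel feature map and a projected query location.
fmap = [[0.0, 1.0],
        [2.0, 3.0]]
pixel_feat = bilinear_sample(fmap, 0.5, 0.5)  # interpolates the four neighbors

global_feat = [0.2]  # e.g. pooled over the whole image (assumed)
point_feat = [0.7]   # e.g. per-point positional encoding (assumed)
feature_bank = global_feat + point_feat + [pixel_feat]
print(feature_bank)
```

Concatenating features at these three granularities lets the decoder see both scene-level context and fine, location-specific detail for each query point.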