GFPose: Learning 3D Human Pose Prior with Gradient Fields

Learning a 3D human pose prior is essential to human-centered AI. Here, we present GFPose, a versatile framework to model plausible 3D human poses for various applications. At the core of GFPose is a time-dependent score network, which estimates the gradient on each body joint and progressively denoises the perturbed 3D human pose to match a given task specification. During the denoising process, GFPose implicitly incorporates pose priors in gradients and unifies various discriminative and generative tasks in an elegant framework. Despite its simplicity, GFPose demonstrates great potential in several downstream tasks. Our experiments empirically show that 1) as a multi-hypothesis pose estimator, GFPose outperforms existing SOTAs by 20% on the Human3.6M dataset; 2) as a single-hypothesis pose estimator, GFPose achieves comparable results to deterministic SOTAs, even with a vanilla backbone; and 3) GFPose is able to produce diverse and realistic samples in pose denoising, completion and generation tasks. Project page: https://sites.google.com/view/gfpose/
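To make the idea of score-guided progressive denoising concrete, here is a minimal toy sketch in Python. It is not the GFPose implementation: the learned time-dependent score network is replaced by the analytic score of a Gaussian centered on a single hypothetical pose, and the sampler is a generic reverse-diffusion update for a variance-exploding noise schedule. The names `toy_score`, `denoise`, `mean_pose`, and `data_std`, the 17-joint layout, and the noise schedule are all assumptions made for illustration.

```python
import numpy as np


def toy_score(x, sigma, mean_pose, data_std=0.05):
    """Analytic score of N(mean_pose, (data_std^2 + sigma^2) I).

    Stand-in for a learned score network (assumption): it returns the
    gradient of the log-density of the noise-perturbed toy pose prior.
    """
    return -(x - mean_pose) / (data_std**2 + sigma**2)


def denoise(x_init, mean_pose, sigmas, rng):
    """Reverse-diffusion updates over a decreasing noise schedule."""
    x = x_init.copy()
    for s_hi, s_lo in zip(sigmas[:-1], sigmas[1:]):
        step = s_hi**2 - s_lo**2  # variance removed at this step
        x = x + step * toy_score(x, s_hi, mean_pose)        # follow the gradient
        x = x + np.sqrt(step) * rng.standard_normal(x.shape)  # stochastic term
    return x


rng = np.random.default_rng(0)
num_joints = 17  # illustrative joint count, not taken from the abstract
mean_pose = rng.standard_normal((num_joints, 3))  # hypothetical "plausible" pose
sigmas = np.geomspace(1.0, 0.01, 200)             # noise levels, high to low

# Start from a heavily perturbed pose and denoise it step by step.
x_init = mean_pose + sigmas[0] * rng.standard_normal((num_joints, 3))
x_final = denoise(x_init, mean_pose, sigmas, rng)

err_before = np.linalg.norm(x_init - mean_pose, axis=-1).mean()
err_after = np.linalg.norm(x_final - mean_pose, axis=-1).mean()
print(f"mean joint error before: {err_before:.3f}, after: {err_after:.3f}")
```

In this toy setting the per-joint gradient simply pulls each joint toward the prior mean. Conditioning on a task specification (for example 2D keypoints or a partial pose) would, under the same scheme, change what the score function receives as input.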

CVPR 2023
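| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Multi-Hypotheses 3D Human Pose Estimation | Human3.6M | GFPose (HPJ2D-010, S=200) | Average MPJPE (mm) | 35.1 | #1 |
| Multi-Hypotheses 3D Human Pose Estimation | Human3.6M | GFPose (HPJ2D-000, S=200) | Average MPJPE (mm) | 35.6 | #3 |
| Multi-Hypotheses 3D Human Pose Estimation | Human3.6M | GFPose (HPJ2D-000, S=200) | Average PMPJPE (mm) | 30.5 | #1 |
| Multi-Hypotheses 3D Human Pose Estimation | Human3.6M | GFPose (HPJ2D-000, S=200) | Using 2D ground-truth joints | 16.9 | #1 |
| Multi-Hypotheses 3D Human Pose Estimation | MPI-INF-3DHP | GFPose (HPJ2D-000, S=200) | PCK | 86.9 | #1 |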
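For readers unfamiliar with the metrics in the table, the following Python sketch shows how MPJPE and PCK are commonly computed. It is an illustration, not the benchmark's evaluation code. It assumes that S denotes the number of sampled hypotheses and that multi-hypothesis results score the best hypothesis per pose (a common protocol, not confirmed by this page). The 150 mm default PCK threshold and the function names are likewise assumptions.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints.

    pred, gt: arrays of shape (..., J, 3), in millimetres.
    """
    return np.linalg.norm(pred - gt, axis=-1).mean(axis=-1)


def best_hypothesis_mpjpe(hypotheses, gt):
    """MPJPE of the closest of S hypotheses (assumed best-of-S protocol).

    hypotheses: (S, J, 3); gt: (J, 3).
    """
    return mpjpe(hypotheses, gt[None]).min()


def pck(pred, gt, threshold_mm=150.0):
    """Percentage of joints within threshold_mm of the ground truth."""
    dists = np.linalg.norm(pred - gt, axis=-1)
    return 100.0 * (dists <= threshold_mm).mean()


# Tiny hand-checkable example with two joints and two hypotheses.
gt = np.zeros((2, 3))
hyp_far = gt + np.array([3.0, 4.0, 0.0])   # every joint is 5 mm away
hyp_near = gt + np.array([0.0, 0.0, 1.0])  # every joint is 1 mm away
hypotheses = np.stack([hyp_far, hyp_near])

print(mpjpe(hyp_far, gt))                     # MPJPE of a single hypothesis
print(best_hypothesis_mpjpe(hypotheses, gt))  # error of the best hypothesis
print(pck(hyp_near, gt, threshold_mm=2.0))    # PCK at a 2 mm threshold
```

PMPJPE applies the same distance after rigidly aligning the prediction to the ground truth (Procrustes alignment), which is omitted here for brevity.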
