Collaborative Regression of Expressive Bodies using Moderation

11 May 2021 · Yao Feng, Vasileios Choutas, Timo Bolkart, Dimitrios Tzionas, Michael J. Black

Recovering expressive humans from images is essential for understanding human behavior. Methods that estimate 3D bodies, faces, or hands have progressed significantly, yet separately. Face methods recover accurate 3D shape and geometric details, but need a tight crop and struggle with extreme views and low resolution. Whole-body methods are robust to a wide range of poses and resolutions, but provide only a rough 3D face shape without details like wrinkles. To get the best of both worlds, we introduce PIXIE, which produces animatable, whole-body 3D avatars with realistic facial detail, from a single image. For this, PIXIE uses two key observations. First, existing work combines independent estimates from body, face, and hand experts, by trusting them equally. PIXIE introduces a novel moderator that merges the features of the experts, weighted by their confidence. All part experts can contribute to the whole, using SMPL-X's shared shape space across all body parts. Second, human shape is highly correlated with gender, but existing work ignores this. We label training images as male, female, or non-binary, and train PIXIE to infer "gendered" 3D body shapes with a novel shape loss. In addition to 3D body pose and shape parameters, PIXIE estimates expression, illumination, albedo and 3D facial surface displacements. Quantitative and qualitative evaluation shows that PIXIE estimates more accurate whole-body shape and detailed face shape than the state of the art. Models and code are available at https://pixie.is.tue.mpg.de.
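
As a rough illustration of the confidence-weighted merging described in the abstract, the sketch below blends the feature vectors of two experts using softmax weights. It is not PIXIE's implementation: the names softmax_weights and merge_features, the temperature parameter, and the choice to pass raw confidence scores in by hand are all assumptions made for illustration. In a learned system, a network would presumably predict those scores from the expert features.

```python
import math
from typing import List, Sequence


def softmax_weights(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Convert raw confidence scores into non-negative weights that sum to 1."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def merge_features(
    features: Sequence[Sequence[float]],
    scores: Sequence[float],
    temperature: float = 1.0,
) -> List[float]:
    """Blend expert feature vectors as a convex combination.

    `features` holds one vector per expert (hypothetical example: body and face),
    and `scores` holds one raw confidence per expert (assumed to be given).
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one confidence score per feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    weights = softmax_weights(scores, temperature)
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


# Hypothetical features from a body expert and a face expert.
body_feature = [1.0, 0.0]
face_feature = [0.0, 1.0]

# Equal confidence: both experts contribute equally.
equal_merge = merge_features([body_feature, face_feature], [0.0, 0.0])

# Much higher confidence in the face expert: its feature dominates.
face_dominant = merge_features([body_feature, face_feature], [-10.0, 10.0])
```

With equal scores the result is the plain average of the two vectors. As one score grows relative to the other, the merged feature approaches the more confident expert's feature. Trusting all experts equally, which the abstract attributes to existing work, corresponds to the equal-score case.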

Task: 3D Multi-Person Mesh Recovery
Dataset: AGORA
Model: PIXIE

  Metric Name    Metric Value   Global Rank
  FB-NMVE        233.9          # 1
  B-NMVE         173.4          # 1
  FB-NMJE        230.9          # 1
  B-NMJE         171.1          # 1
  FB-MVE         191.8          # 1
  B-MVE          142.2          # 1
  F-MVE          50.2           # 2
  LH/RH-MVE      49.5/49.0      # 1
  FB-MPJPE       189.3          # 1
  B-MPJPE        140.3          # 1
  F-MPJPE        54.5           # 2
  LH/RH-MPJPE    46.4/46.0      # 1

Task: 3D Human Reconstruction
Dataset: Expressive hands and faces dataset (EHF)
Model: PIXIE

  Metric Name                 Metric Value   Global Rank
  PA V2V (mm), whole body     55             # 2
  PA V2V (mm), body only      53             # 3
  PA V2V (mm), left hand      11.2           # 4
  PA V2V (mm), face           4.6            # 1
  TR V2V (mm), whole body     67.6           # 2
  TR V2V (mm), body only      75.8           # 1
  TR V2V (mm), left hand      25.6           # 2
  TR V2V (mm), face           14.2           # 2
  MPJPE-14                    61.5           # 1
  MPJPE, left hand            11.7           # 1
  mean P2S                    29.9           # 2
  median P2S                  18.4           # 2

Task: 3D Hand Pose Estimation
Dataset: FreiHAND
Model: PIXIE hand expert

  Metric Name    Metric Value   Global Rank
  PA-MPVPE       12.1           # 9
  PA-MPJPE       12             # 9
  PA-F@5mm       46.8           # 9
  PA-F@15mm      91.9           # 9

Task: 3D Face Reconstruction
Dataset: NoW Benchmark
Model: PIXIE

  Metric Name                        Metric Value   Global Rank
  Mean Reconstruction Error (mm)     1.49           # 5
  Stdev Reconstruction Error (mm)    1.25           # 5
  Median Reconstruction Error        1.18           # 5
