ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis

Estimating the articulated 3D hand-object pose from a single RGB image is a highly ambiguous and challenging problem, requiring large-scale datasets that contain diverse hand poses, object types, and camera viewpoints. Most real-world datasets lack this diversity. In contrast, data synthesis can easily provide each kind of diversity on its own. However, constructing hand-object interactions that are both valid and diverse, and learning efficiently from the vast synthetic data, is still challenging. To address these issues, we propose ArtiBoost, a lightweight online data enhancement method. ArtiBoost covers diverse hand-object poses and camera viewpoints by sampling in a Composited hand-object Configuration and Viewpoint space (CCV-space), and adaptively enriches the items that are currently hard to discern through loss feedback and sample re-weighting. ArtiBoost alternately performs data exploration and synthesis within a learning pipeline, and the synthetic data are blended into the real-world source data for training. We apply ArtiBoost to a simple learning baseline network and observe performance gains on several hand-object benchmarks. Our models and code are available at https://github.com/lixiny/ArtiBoost.
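The loss-feedback and sample re-weighting idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `WeightedItemSampler`, the update rule (multiply a weight up when an item's loss is above the batch mean, down otherwise), and the parameters `step`, `min_w`, and `max_w` are all assumptions made for illustration. Each "item" stands in for one discretized point in a configuration-and-viewpoint space.

```python
import numpy as np


class WeightedItemSampler:
    """Toy sampler over a discretized item space (hypothetical, for illustration)."""

    def __init__(self, num_items, seed=0):
        # Start from uniform weights: every item is equally likely to be drawn.
        self.weights = np.ones(num_items, dtype=np.float64)
        self.rng = np.random.default_rng(seed)

    def probabilities(self):
        # Normalize weights into a sampling distribution.
        return self.weights / self.weights.sum()

    def sample(self, batch_size):
        # Exploration step: draw item indices in proportion to their weights.
        return self.rng.choice(len(self.weights), size=batch_size, p=self.probabilities())

    def update(self, item_ids, losses, step=0.5, min_w=0.1, max_w=10.0):
        # Feedback step (assumed rule): items whose loss is above the batch
        # mean are up-weighted, the others are down-weighted.
        item_ids = np.asarray(item_ids)
        losses = np.asarray(losses, dtype=np.float64)
        factor = np.where(losses > losses.mean(), 1.0 + step, 1.0 - step)
        # ufunc.at applies the factor once per occurrence, so repeated ids accumulate.
        np.multiply.at(self.weights, item_ids, factor)
        # Keep weights in a bounded range so no item vanishes or dominates.
        np.clip(self.weights, min_w, max_w, out=self.weights)


sampler = WeightedItemSampler(num_items=4)
sampler.update(item_ids=[0, 1, 2, 3], losses=[1.0, 1.0, 1.0, 5.0])
batch = sampler.sample(batch_size=8)
```

In a training loop, the sampled indices would be rendered into synthetic images, mixed with real data, and the per-item losses fed back through `update`, so that hard items are drawn more often in later rounds.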

CVPR 2022

Results from the Paper


Ranked #3 on hand-object pose on HO-3D (using extra training data)

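| Task                    | Dataset | Model     | Metric Name              | Metric Value | Global Rank |
|-------------------------|---------|-----------|--------------------------|--------------|-------------|
| hand-object pose        | DexYCB  | ArtiBoost | Average MPJPE (mm)       | 12.8         | #4          |
| hand-object pose        | DexYCB  | ArtiBoost | Procrustes-Aligned MPJPE | -            | #4          |
| hand-object pose        | DexYCB  | ArtiBoost | OCE                      | -            | #6          |
| hand-object pose        | DexYCB  | ArtiBoost | MCE                      | -            | #5          |
| hand-object pose        | DexYCB  | ArtiBoost | ADD-S                    | -            | #4          |
| hand-object pose        | HO-3D   | ArtiBoost | Average MPJPE (mm)       | 26.3         | #4          |
| hand-object pose        | HO-3D   | ArtiBoost | ST-MPJPE                 | 25.3         | #3          |
| hand-object pose        | HO-3D   | ArtiBoost | PA-MPJPE                 | 11.4         | #7          |
| hand-object pose        | HO-3D   | ArtiBoost | OME                      | -            | #7          |
| hand-object pose        | HO-3D   | ArtiBoost | ADD-S                    | -            | #7          |
| 3D Hand Pose Estimation | HO-3D   | ArtiBoost | Average MPJPE (mm)       | 26.3         | #6          |
| 3D Hand Pose Estimation | HO-3D   | ArtiBoost | ST-MPJPE (mm)            | 25.3         | #8          |
| 3D Hand Pose Estimation | HO-3D   | ArtiBoost | PA-MPJPE (mm)            | 11.4         | #12         |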
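
For readers unfamiliar with the joint-error metrics listed above, the following is a minimal sketch of how MPJPE and PA-MPJPE are commonly computed. It assumes joints are given as `(J, 3)` arrays in millimetres and uses the standard similarity Procrustes solution (scale, rotation, translation via SVD); the function names are illustrative, and the official benchmark evaluation code may differ in details such as root alignment or joint ordering.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best-fit similarity transform (scale, rotation, translation)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance between centered prediction and target.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = np.ones(3)
    if np.linalg.det(vt.T @ u.T) < 0:
        d[-1] = -1.0
    rot = vt.T @ np.diag(d) @ u.T
    scale = (s * d).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE measured after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)


# Usage: a prediction that is a scaled, rotated, shifted copy of the ground truth
# has a large MPJPE but a PA-MPJPE close to zero.
rng = np.random.default_rng(0)
gt_joints = rng.normal(size=(21, 3)) * 50.0
angle = np.pi / 6
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
pred_joints = 2.0 * gt_joints @ rot_z.T + np.array([10.0, -5.0, 3.0])
raw_error = mpjpe(pred_joints, gt_joints)
aligned_error = pa_mpjpe(pred_joints, gt_joints)
```

Because PA-MPJPE removes global scale, rotation, and translation, it isolates errors in the articulated shape itself, which is why it is typically much lower than the unaligned MPJPE.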

Methods