Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors

9 Feb 2023 · Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui

Predicting diverse human motions given a sequence of historical poses has received increasing attention. Despite rapid progress, existing work captures the multi-modal nature of human motion primarily through likelihood-based sampling, where mode collapse has been widely observed. In this paper, we propose a simple yet effective approach that disentangles randomly sampled codes with a deterministic learnable component, named anchors, to promote sample precision and diversity. Anchors are further factorized into spatial anchors and temporal anchors, which provide interpretable control over spatial-temporal disparity. In principle, our spatial-temporal anchor-based sampling (STARS) can be applied to different motion predictors. Here, we propose an interaction-enhanced spatial-temporal graph convolutional network (IE-STGCN) that encodes prior knowledge of human motion (e.g., spatial locality), and incorporate the anchors into it. Extensive experiments demonstrate that our approach outperforms the state of the art in both stochastic and deterministic prediction, suggesting that it can serve as a unified framework for modeling human motion. Our code and pretrained models are available at https://github.com/Sirui-Xu/STARS.
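
As a rough illustration of the anchor-based sampling idea described in the abstract, the sketch below adds learnable spatial and temporal anchors to a randomly sampled latent code before decoding. All module names, tensor shapes, and the generic decoder are assumptions made for exposition; they do not reproduce the paper's IE-STGCN implementation, which is available in the linked repository.

```python
# Minimal sketch of spatial-temporal anchor-based sampling (assumed shapes).
import torch
import torch.nn as nn

class AnchoredSampler(nn.Module):
    """Toy sampler: random Gaussian code + learnable spatial/temporal anchors."""
    def __init__(self, num_anchors, num_frames, num_joints, feat_dim, decoder):
        super().__init__()
        # Deterministic learnable components, factorized over space and time.
        self.spatial_anchors = nn.Parameter(
            0.01 * torch.randn(num_anchors, 1, num_joints, feat_dim))
        self.temporal_anchors = nn.Parameter(
            0.01 * torch.randn(num_anchors, num_frames, 1, feat_dim))
        self.decoder = decoder  # any predictor mapping (history, code) -> future poses

    def forward(self, history, num_samples):
        # history: (B, T_past, J, 3) observed poses
        B = history.shape[0]
        K, T, _, D = self.temporal_anchors.shape
        J = self.spatial_anchors.shape[2]
        futures = []
        for k in range(num_samples):
            # The anchor pair contributes a deterministic, learnable mode;
            # the Gaussian code adds variation within that mode.
            z = torch.randn(B, T, J, D, device=history.device)
            code = z + self.spatial_anchors[k % K] + self.temporal_anchors[k % K]
            futures.append(self.decoder(history, code))
        return torch.stack(futures, dim=1)  # (B, num_samples, T_future, J, 3)
```

In the full method, spatial and temporal anchors are factorized so that they can be combined; the sketch above pairs them one-to-one purely for brevity.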

Datasets

Human3.6M, HumanEva-I

Results from the Paper


Task                    Dataset     Model  Metric        Value   Global Rank
Human Pose Forecasting  Human3.6M   STARS  APD           15.884  #2
Human Pose Forecasting  Human3.6M   STARS  ADE           0.358   #2
Human Pose Forecasting  Human3.6M   STARS  FDE           0.445   #2
Human Pose Forecasting  Human3.6M   STARS  MMADE         0.442   #1
Human Pose Forecasting  Human3.6M   STARS  MMFDE         0.471   #2
Human Pose Forecasting  HumanEva-I  STARS  APD@2000ms    6.031   #3
Human Pose Forecasting  HumanEva-I  STARS  ADE@2000ms    0.217   #2
Human Pose Forecasting  HumanEva-I  STARS  FDE@2000ms    0.241   #3
Human Pose Forecasting  HumanEva-I  STARS  MMADE@2000ms  0.328   #1
Human Pose Forecasting  HumanEva-I  STARS  MMFDE@2000ms  0.321   #2
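
The metrics above follow the standard stochastic human motion prediction protocol: APD measures diversity among predicted samples, while ADE and FDE measure the best sample's displacement from the ground truth. The sketch below gives one common formulation of APD, ADE, and FDE; the tensor shapes and reductions are assumptions and may differ in detail from the official evaluation code.

```python
import torch

def apd(samples):
    """Average Pairwise Distance: diversity among S predicted sequences.
    samples: (S, T, J, 3)"""
    S = samples.shape[0]
    flat = samples.reshape(S, -1)
    dist = torch.cdist(flat, flat)         # (S, S) pairwise L2 distances
    return dist.sum() / (S * (S - 1))      # mean over pairs, excluding self-pairs

def ade(samples, gt):
    """Average Displacement Error: best sample's mean per-frame pose distance.
    samples: (S, T, J, 3), gt: (T, J, 3)"""
    diff = (samples - gt.unsqueeze(0)).flatten(2)   # (S, T, J*3)
    return diff.norm(dim=-1).mean(dim=1).min()

def fde(samples, gt):
    """Final Displacement Error: best sample's final-frame pose distance."""
    diff = (samples[:, -1] - gt[-1]).flatten(1)     # (S, J*3)
    return diff.norm(dim=-1).min()
```

MMADE and MMFDE extend ADE and FDE to a set of multi-modal ground-truth futures collected from similar past poses, rather than a single future.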

Methods


No methods listed for this paper.