MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics

Long-term human motion can be represented as a series of motion modes---motion sequences that capture short-term temporal dynamics---with transitions between them. We leverage this structure and present a novel Motion Transformation Variational Auto-Encoder (MT-VAE) for learning motion sequence generation. Our model jointly learns a feature embedding for motion modes (from which the motion sequence can be reconstructed) and a feature transformation that represents the transition from one motion mode to the next. Given the same input, our model is able to generate multiple diverse and plausible future motion sequences. We apply our approach to both facial and full-body motion, and demonstrate applications such as analogy-based motion transfer and video synthesis.
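The transition structure described above can be sketched in code. The following is a minimal illustration under assumed shapes, with hypothetical linear maps (W_mu, W_lv, W_T) standing in for the learned encoder and transformation networks; it is not the paper's actual implementation, only the general VAE-over-transitions idea.

```python
import numpy as np

# Hypothetical sizes: motion-mode embedding and transformation latent.
rng = np.random.default_rng(0)
D_MODE, D_LAT = 16, 4

# Hypothetical linear maps standing in for the learned networks.
W_mu = rng.normal(scale=0.1, size=(2 * D_MODE, D_LAT))      # (e_A, e_B) -> mu
W_lv = rng.normal(scale=0.1, size=(2 * D_MODE, D_LAT))      # (e_A, e_B) -> log-variance
W_T = rng.normal(scale=0.1, size=(D_MODE + D_LAT, D_MODE))  # (e_A, z)  -> e_B'

def encode_transition(e_a, e_b):
    """Infer the latent transformation between two motion-mode embeddings."""
    h = np.concatenate([e_a, e_b])
    return h @ W_mu, h @ W_lv

def sample_z(mu, logvar, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps."""
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

def apply_transformation(e_a, z):
    """Predict the next motion-mode embedding from the current one and z."""
    return np.tanh(np.concatenate([e_a, z]) @ W_T)

# Training-time path: reconstruct the next mode through the latent transformation.
e_a, e_b = rng.normal(size=D_MODE), rng.normal(size=D_MODE)
mu, logvar = encode_transition(e_a, e_b)
e_b_hat = apply_transformation(e_a, sample_z(mu, logvar, rng))

# Test-time path: sampling different z from the prior gives diverse plausible
# future motion modes from the same input -- the multimodality in the title.
futures = [apply_transformation(e_a, rng.normal(size=D_LAT)) for _ in range(3)]
```

In the full model each motion-mode embedding would be produced by a sequence encoder and decoded back to a pose sequence; here those networks are elided and only the transformation pathway is shown.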

ECCV 2018

Results from the Paper

Task                    Dataset     Model   Metric       Value  Rank
Human Pose Forecasting  Human3.6M   MT-VAE  APD          403    #12
                                            ADE          457    #7
                                            FDE          595    #10
                                            MMADE        716    #10
                                            MMFDE        883    #11
Human Pose Forecasting  HumanEva-I  MT-VAE  APD@2000ms   21     #10
                                            ADE@2000ms   345    #9
                                            FDE@2000ms   403    #9
