Motion Representations for Articulated Animation

We propose novel motion representations for animating articulated objects consisting of distinct parts. In a completely unsupervised manner, our method identifies object parts, tracks them in a driving video, and infers their motions by considering their principal axes. In contrast to previous keypoint-based works, our method extracts meaningful and consistent regions that describe location, shape, and pose. The regions correspond to semantically relevant and distinct object parts, which are more easily detected in frames of the driving video. To force decoupling of foreground from background, we model non-object-related global motion with an additional affine transformation. To facilitate animation and prevent leakage of the driving object's shape, we disentangle the shape and pose of objects in the region space. Our model can animate a variety of objects, surpassing previous methods by a large margin on existing benchmarks. We present a challenging new benchmark with high-resolution videos and show that the improvement is particularly pronounced when articulated objects are considered, reaching 96.6% user preference vs. the state of the art.
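
The idea of describing a region by its location, shape, and principal axes can be illustrated with a small sketch. It is not the authors' implementation: it assumes a region is given as a non-negative 2D heatmap, treats the heatmap as a distribution over pixel coordinates, and reads the principal axes off the 2x2 covariance in closed form. The function names `region_moments` and `principal_axes` are hypothetical and chosen only for this example.

```python
import math


def region_moments(heatmap):
    """Return (mean, covariance) of a non-negative 2D heatmap.

    The heatmap is treated as an unnormalized distribution over pixel
    coordinates, with x = column index and y = row index. The covariance
    is returned as the tuple (sxx, sxy, syy).
    """
    total = sum(sum(row) for row in heatmap)
    if total <= 0:
        raise ValueError("heatmap must contain positive mass")

    # First moment: weighted centre of the region.
    mx = my = 0.0
    for y, row in enumerate(heatmap):
        for x, w in enumerate(row):
            mx += w * x
            my += w * y
    mx /= total
    my /= total

    # Second central moments: spread of the region around its centre.
    sxx = sxy = syy = 0.0
    for y, row in enumerate(heatmap):
        for x, w in enumerate(row):
            dx, dy = x - mx, y - my
            sxx += w * dx * dx
            sxy += w * dx * dy
            syy += w * dy * dy
    return (mx, my), (sxx / total, sxy / total, syy / total)


def principal_axes(cov):
    """Closed-form eigen-decomposition of a symmetric 2x2 covariance.

    Returns (angle of the major axis in radians, std along the major
    axis, std along the minor axis).
    """
    sxx, sxy, syy = cov
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    lam_major = half_trace + radius
    # Clamp to zero to absorb tiny negative values from rounding.
    lam_minor = max(half_trace - radius, 0.0)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return angle, math.sqrt(lam_major), math.sqrt(lam_minor)
```

For a heatmap consisting of a single horizontal bar, the major axis has angle 0 and the standard deviation along the minor axis is 0; for a diagonal line of equal weights, the angle is pi/4. Comparing the mean, axis angle, and axis lengths of the same region between two frames gives a rough picture of how that part translated, rotated, and scaled.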

Published at CVPR 2021.

Datasets


Introduced in the Paper:

TED-talks

Used in the Paper:

VoxCeleb1, MGif, Tai-Chi-HD

Results

Task: Video Reconstruction

Dataset           Model            Metric  Value   Global Rank
MGif              FOMM             L1      0.0223  #2
MGif              Siarohin et al.  L1      0.0206  #1
Tai-Chi-HD (256)  FOMM             L1      0.056   #2
Tai-Chi-HD (256)  FOMM             AED     0.172   #2
Tai-Chi-HD (256)  FOMM             AKD     6.53    #2
Tai-Chi-HD (256)  FOMM             MKR     0.033   #2
Tai-Chi-HD (256)  Siarohin et al.  L1      0.047   #1
Tai-Chi-HD (256)  Siarohin et al.  AED     0.152   #1
Tai-Chi-HD (256)  Siarohin et al.  AKD     5.58    #1
Tai-Chi-HD (256)  Siarohin et al.  MKR     0.027   #1
Tai-Chi-HD (512)  FOMM             L1      0.075   #2
Tai-Chi-HD (512)  FOMM             AKD     17.12   #2
Tai-Chi-HD (512)  FOMM             MKR     0.066   #2
Tai-Chi-HD (512)  FOMM             AED     0.203   #2
Tai-Chi-HD (512)  Siarohin et al.  L1      0.064   #1
Tai-Chi-HD (512)  Siarohin et al.  AKD     13.86   #1
Tai-Chi-HD (512)  Siarohin et al.  MKR     0.043   #1
Tai-Chi-HD (512)  Siarohin et al.  AED     0.172   #1
TED-talks         Siarohin et al.  L1      0.026   #1
TED-talks         Siarohin et al.  AKD     3.75    #2
TED-talks         Siarohin et al.  MKR     0.007   #1
TED-talks         Siarohin et al.  AED     0.114   #1
TED-talks         FOMM             L1      0.033   #2
TED-talks         FOMM             AKD     7.07    #1
TED-talks         FOMM             MKR     0.014   #2
TED-talks         FOMM             AED     0.163   #2
VoxCeleb          Siarohin et al.  L1      0.040   #1
VoxCeleb          Siarohin et al.  AKD     1.28    #2
VoxCeleb          Siarohin et al.  AED     0.133   #1
VoxCeleb          FOMM             L1      0.041   #2
VoxCeleb          FOMM             AKD     1.27    #1
VoxCeleb          FOMM             AED     0.134   #2
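
As a rough illustration of the L1 metric reported above, the sketch below computes the mean absolute difference between a reconstructed video and its ground truth. It rests on several assumptions not stated in this listing: videos are nested Python lists of pixel intensities, both videos have the same shape, and the error is averaged over every pixel of every frame. The benchmark's exact normalization and pixel range may differ, and the name `l1_error` is hypothetical.

```python
def _flatten(values):
    """Yield every scalar in an arbitrarily nested list or tuple as a float."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from _flatten(v)
        else:
            yield float(v)


def l1_error(reconstructed, ground_truth):
    """Mean absolute difference over all frames and pixels.

    Both arguments are nested lists (frames -> rows -> pixel values).
    Assumption: the two videos contain the same number of values.
    """
    a = list(_flatten(reconstructed))
    b = list(_flatten(ground_truth))
    if not a or len(a) != len(b):
        raise ValueError("videos must be non-empty and have the same size")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)
```

With a single 2x2 frame where two of the four pixels differ by 0.25, `l1_error` returns 0.125. Lower values mean the reconstruction is closer to the ground-truth video.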
