2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning

CVPR 2018 · Diogo C. Luvizon, David Picard, Hedi Tabia

Action recognition and human pose estimation are closely related, but the two problems are generally handled as distinct tasks in the literature. In this work, we propose a multitask framework for joint 2D and 3D pose estimation from still images and human action recognition from video sequences. We show that a single architecture can solve the two problems efficiently and still achieve state-of-the-art results. Additionally, we demonstrate that end-to-end optimization leads to significantly higher accuracy than separate learning. The proposed architecture can be trained seamlessly with data from different categories simultaneously. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU) demonstrate the effectiveness of our method on the targeted tasks.

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
3D Human Pose Estimation | Human3.6M | 2D-3D-Softargmax (multi-crop + h.flip) | Average MPJPE (mm) | 53.2 | # 165
Action Recognition In Videos | NTU RGB+D | 2D-3D-Softargmax (RGB only) | Accuracy (CS) | 85.5 | # 1
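
For readers unfamiliar with the MPJPE metric reported above: it stands for mean per-joint position error, the average Euclidean distance between predicted and ground-truth joint positions. The following is a minimal sketch of that computation for a single pose, not the paper's evaluation code. The function name `mpjpe` and the toy coordinates are illustrative assumptions, and benchmark protocols typically apply additional steps (such as aligning poses at a root joint and averaging over many frames) that are omitted here.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error between two poses.

    Each pose is a sequence of (x, y, z) joint coordinates in millimetres,
    with joints listed in the same order in both poses.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # Euclidean distance for each joint, then the mean over all joints.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Toy two-joint pose with hypothetical coordinates (not data from the paper).
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(pred, truth)
```

In this toy case the first joint matches exactly and the second is 5 mm off, so the mean error over the two joints is 2.5 mm.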
