Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection

ICCV 2017 · Mohammadreza Zolfaghari, Gabriel L. Oliveira, Nima Sedaghat, Thomas Brox

General human action recognition requires understanding of various visual cues. In this paper, we propose a network architecture that computes and integrates the most important visual cues for action recognition: pose, motion, and the raw images...

| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
| Skeleton Based Action Recognition | J-HMDB | Chained (RGB+Flow+Pose) | Accuracy (RGB+pose) | 76.1 | # 6 |
| Skeleton Based Action Recognition | J-HMDB | Chained (RGB+Flow+Pose) | Accuracy (pose) | 56.8 | # 4 |
| Skeleton Based Action Recognition | JHMDB (2D poses only) | Chained | Average accuracy of 3 splits | 56.8 | # 4 |