Unifying Graph Embedding Features with Graph Convolutional Networks for Skeleton-based Action Recognition

6 Mar 2020  ·  Dong Yang, Monica Mengqi Li, Hong Fu, Jicong Fan, Zhao Zhang, Howard Leung ·

Combining skeleton structure with graph convolutional networks has achieved remarkable performance in human action recognition. However, because current research focuses on designing basic graphs to represent skeleton data, the resulting embedding features capture only basic topological information and cannot learn more systematic perspectives from skeleton data. In this paper, we overcome this limitation by proposing a novel framework that unifies 15 graph embedding features within a graph convolutional network for human action recognition, aiming to make the best use of graph information to distinguish key joints, bones, and body parts in human actions, rather than being restricted to a single feature or domain. Additionally, we thoroughly investigate how to find the best graph features of the skeleton structure for improving human action recognition. The topological information of the skeleton sequence is also explored to further enhance performance in a multi-stream framework. Moreover, the unified graph features are extracted by adaptive methods during training, which yields further improvements. Our model is validated on three large-scale datasets, namely NTU-RGB+D, Kinetics, and SYSU-3D, and outperforms state-of-the-art methods. Overall, our work unifies graph embedding features to promote systematic research on human action recognition.
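The core building block the abstract refers to, a graph convolution over the skeleton's joint adjacency, can be sketched as follows. This is a minimal illustration with NumPy, not the paper's implementation: the 5-joint chain skeleton, feature dimensions, and symmetric normalization with self-loops are all assumptions chosen for clarity.

```python
import numpy as np

def normalized_adjacency(A):
    # Symmetrically normalize A + I (adding self-loops), as in standard GCNs:
    # D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(X, A, W):
    # One graph-convolution step: aggregate neighboring joint features,
    # project to a new channel dimension, then apply ReLU.
    return np.maximum(normalized_adjacency(A) @ X @ W, 0.0)

# Toy 5-joint "skeleton" modeled as a chain 0-1-2-3-4 (hypothetical example).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

X = np.random.randn(5, 3)   # 3-D joint coordinates as input features
W = np.random.randn(3, 8)   # learnable projection to 8 output channels
out = gcn_layer(X, A, W)
print(out.shape)  # (5, 8): 8 features per joint
```

In a multi-stream setting such as the one described above, several such branches (e.g. joint, bone, and body-part streams, each with its own adjacency) would be run in parallel and their outputs fused before classification.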

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Skeleton Based Action Recognition | Kinetics-Skeleton | CGCN | Accuracy | 37.5 | #12 |
| Skeleton Based Action Recognition | NTU RGB+D | CGCN | Accuracy (CV) | 96.4 | #26 |
| Skeleton Based Action Recognition | NTU RGB+D | CGCN | Accuracy (CS) | 90.3 | #36 |
