DeepActsNet: Spatial and Motion features from Face, Hands, and Body Combined with Convolutional and Graph Networks for Improved Action Recognition

21 Sep 2020 · Umar Asif, Deval Mehta, Stefan von Cavallar, Jianbin Tang, Stefan Harrer

Existing action recognition methods mainly focus on joint and bone information in human body skeleton data because such data is robust to complex backgrounds and to dynamic characteristics of the environment. In this paper, we combine body skeleton data with spatial and motion features from the face and both hands, and present "Deep Action Stamps (DeepActs)", a novel data representation to encode actions from video sequences. We also present "DeepActsNet", a deep-learning-based ensemble model that learns convolutional and structural features from Deep Action Stamps for highly accurate action recognition. Experiments on three challenging action recognition datasets (NTU60, NTU120, and SYSU) show that the proposed model, trained using Deep Action Stamps, produces considerable improvements in action recognition accuracy at lower computational cost than state-of-the-art methods.
