Make Skeleton-based Action Recognition Model Smaller, Faster and Better

Although skeleton-based action recognition has achieved great success in recent years, most existing methods may suffer from large model sizes and slow execution speeds. To alleviate this issue, we analyze the properties of skeleton sequences and propose a Double-feature Double-motion Network (DD-Net) for skeleton-based action recognition. With a lightweight network structure (0.15 million parameters), DD-Net runs very fast: 3,500 FPS on one GPU or 2,000 FPS on one CPU. By employing robust features, DD-Net achieves state-of-the-art performance on our experimental datasets: SHREC (hand actions) and JHMDB (body actions). Our code will be released with this paper later.

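| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Skeleton Based Action Recognition | J-HMDB | DD-Net | Accuracy (RGB+pose) | - | #11 |
| Skeleton Based Action Recognition | J-HMDB | DD-Net | Accuracy (pose) | 77.2 | #1 |
| Skeleton Based Action Recognition | JHMDB (2D poses only) | DD-Net | Accuracy | 78.0 (average of 3 split train/test) | #1 |
| Skeleton Based Action Recognition | JHMDB (2D poses only) | DD-Net | Average accuracy of 3 splits | 77.2 | #1 |
| Skeleton Based Action Recognition | JHMDB (2D poses only) | DD-Net | No. parameters | 1.82M | #1 |
| Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | DD-Net | Accuracy | 94.6 (14 gestures), 91.9 (28 gestures) | #1 |
| Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | DD-Net | 28 gestures accuracy | 91.9 | #1 |
| Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | DD-Net | 14 gestures accuracy | 94.6 | #1 |
| Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | DD-Net | No. parameters | 1.82M | #4 |
| Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | DD-Net | Speed (FPS) | 2,200 | #3 |
| Hand Gesture Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | DD-Net | 14 gestures accuracy | 94.6 | #3 |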

