An Attention-Enhanced Recurrent Graph Convolutional Network for Skeleton-Based Action Recognition

The dynamic movements of the human skeleton have attracted increasing attention as a robust modality for action recognition. Not all temporal stages and skeleton joints are informative for action recognition, and irrelevant information introduces noise that can degrade recognition performance, so extracting discriminative temporal and spatial features is an important task. In this paper, we propose a novel end-to-end attention-enhanced recurrent graph convolutional network (AR-GCN) for skeleton-based action recognition. AR-GCN employs an attention-enhanced mechanism to pay different levels of attention to different temporal stages and spatial joints. This approach avoids the information loss caused by using only keyframes and key joints. In particular, AR-GCN combines the graph convolutional network (GCN) with the bidirectional recurrent neural network (BRNN), which retains the original GCN's expressive power over irregularly structured joints while improving its sequential modeling ability through the recurrent network. Experimental results demonstrate the effectiveness of our proposed model on the widely used NTU and Kinetics datasets.
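
To make the idea of soft attention over joints and temporal stages concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a standard symmetrically normalized graph convolution and simple dot-product attention scores; the function names, parameter vectors (`w_gcn`, `v_joint`, `v_time`), and tensor shapes are all hypothetical, and the bidirectional recurrent part of AR-GCN is omitted for brevity. The point it illustrates is that every joint and frame keeps a nonzero weight, in contrast to hard selection of keyframes and key joints.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer (assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend_skeleton(seq, adj, w_gcn, v_joint, v_time):
    """Hypothetical helper. seq: (T, J, C) joint features over T frames.

    Returns a clip descriptor of shape (H,), joint weights (T, J),
    and frame weights (T,).
    """
    a_norm = normalize_adjacency(adj)
    # Per-frame graph convolution with ReLU: (J, J) @ (T, J, C) @ (C, H) -> (T, J, H)
    h = np.maximum(a_norm @ seq @ w_gcn, 0.0)
    # Soft spatial attention: weights over joints sum to 1 within each frame
    alpha = softmax(h @ v_joint, axis=1)
    frames = (alpha[..., None] * h).sum(axis=1)      # (T, H)
    # Soft temporal attention: weights over frames sum to 1
    beta = softmax(frames @ v_time, axis=0)
    clip = (beta[:, None] * frames).sum(axis=0)      # (H,)
    return clip, alpha, beta

# Toy example: 4 frames, a 5-joint chain skeleton, 3-D coordinates, 8 hidden units
rng = np.random.default_rng(0)
T, J, C, H = 4, 5, 3, 8
adj = np.zeros((J, J))
for i in range(J - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
clip, alpha, beta = attend_skeleton(
    rng.normal(size=(T, J, C)), adj,
    rng.normal(size=(C, H)), rng.normal(size=H), rng.normal(size=H),
)
```

In a trained model the attention parameters would be learned jointly with the rest of the network; here they are random and serve only to show the shapes and the normalization of the weights.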

Task                               Dataset                    Model   Metric Name    Metric Value  Global Rank
Skeleton Based Action Recognition  Kinetics-Skeleton dataset  AR-GCN  Accuracy       33.5          # 26
Skeleton Based Action Recognition  NTU RGB+D                  AR-GCN  Accuracy (CV)  93.2          # 68
Skeleton Based Action Recognition  NTU RGB+D                  AR-GCN  Accuracy (CS)  85.1          # 79

Methods