PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition

Skeleton-based human action recognition has become an active research area in recent years. The key to this task is to fully explore both spatial and temporal features. Recently, GCN-based methods modeling the human body skeletons as spatial-temporal graphs, have achieved remarkable performances. However, most GCN-based methods use a fixed adjacency matrix defined by the dataset, which can only capture the structural information provided by joints directly connected through bones and ignore the dependencies between distant joints that are not connected. In addition, such a fixed adjacency matrix used in all layers leads to the network failing to extract multi-level semantic features. In this paper we propose a pseudo graph convolutional network with temporal and channel-wise attention (PGCN-TCA) to solve this problem. The fixed normalized adjacent matrix is substituted with a learnable matrix. In this way, the matrix can learn the dependencies between connected joints and joints that are not physically connected. At the same time, learnable matrices in different layers can help the network capture multi-level features in spatial domain. Moreover, Since frames and input channels that contain outstanding characteristics play significant roles in distinguishing the action from others, we propose a mixed temporal and channel-wise attention. Our method achieves comparable performances to state-of-the-art methods on NTU-RGB+D and HDM05 datasets.

PDF Abstract

Datasets


Task Dataset Model Metric Name Metric Value Global Rank Benchmark
Skeleton Based Action Recognition NTU RGB+D PGCN-TCA Accuracy (CV) 93.6 # 61
Accuracy (CS) 88.0 # 58

Methods


No methods listed for this paper. Add relevant methods here