Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition

The representation of 3D pose plays a critical role in 3D body action and hand gesture recognition. Rather than representing a 3D pose directly by its joint locations, we propose Deformable Pose Traversal Convolution, which applies one-dimensional convolution while traversing the 3D pose to represent it. Instead of fixing the receptive field during traversal convolution, it optimizes the convolutional kernel for each joint by weighting contextual joints differently. This deformable convolution makes better use of contextual joints for action and gesture recognition and is more robust to noisy joints. Moreover, by feeding the learned pose feature to an LSTM, we can train end to end, jointly optimizing the 3D pose representation and temporal sequence recognition. Experiments on three benchmark datasets validate the competitive performance of the proposed method, as well as its efficiency and robustness to noisy poses.
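To make the idea of a deformable one-dimensional convolution along a pose traversal concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the 5-joint skeleton, the `traversal` order, the function name `deformable_conv1d`, and the kernel shapes are all hypothetical, and the per-joint offsets are passed in directly rather than predicted by a learned layer.

```python
import numpy as np

def deformable_conv1d(x, weights, offsets):
    """Deformable 1D convolution over a joint sequence (illustrative sketch).

    x:       (T, C_in) joint features in traversal order
    weights: (K, C_in, C_out) kernel taps
    offsets: (T, K) fractional shifts applied to each tap at each position
             (assumed given here; a trained model would predict them)
    """
    T, _ = x.shape
    K, _, c_out = weights.shape
    out = np.zeros((T, c_out))
    for i in range(T):
        for k in range(K):
            # Regular tap position plus its offset, clamped to the sequence ends.
            p = np.clip(i + (k - K // 2) + offsets[i, k], 0, T - 1)
            lo = int(np.floor(p))
            hi = min(lo + 1, T - 1)
            frac = p - lo
            # Linear interpolation between the two neighbouring joints.
            sample = (1 - frac) * x[lo] + frac * x[hi]
            out[i] += sample @ weights[k]
    return out

# Hypothetical 5-joint pose with 3D coordinates and a made-up traversal order.
pose = np.arange(15, dtype=float).reshape(5, 3)
traversal = [0, 1, 2, 1, 3, 4]
x = pose[traversal]

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 3, 4))  # K=3 taps, 3 input channels, 4 output channels

# With all offsets at zero this reduces to an ordinary fixed-receptive-field convolution.
features = deformable_conv1d(x, weights, np.zeros((len(traversal), 3)))
```

Non-zero offsets shift each tap toward other joints in the traversal, which is how a per-joint receptive field can emphasize more informative contextual joints. The resulting per-frame features could then be fed to a recurrent model for sequence recognition.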
