Spatio-Temporal Relation and Attention Learning for Facial Action Unit Detection

5 Jan 2020  ·  Zhiwen Shao, Lixin Zou, Jianfei Cai, Yunsheng Wu, Lizhuang Ma

Spatio-temporal relations among facial action units (AUs) convey significant information for AU detection, yet they have not been thoroughly exploited. The main reasons are the limited capability of existing AU detection methods to learn spatial and temporal relations simultaneously, and the lack of precise localization information for AU feature learning. To tackle these limitations, we propose a novel spatio-temporal relation and attention learning framework for AU detection. Specifically, we introduce a spatio-temporal graph convolutional network to capture both spatial and temporal relations from dynamic AUs, in which the AU relations are formulated as a spatio-temporal graph whose edge weights are adaptively learned rather than predefined. Moreover, learning spatio-temporal relations among AUs requires individual AU features. Considering the dynamism and shape irregularity of AUs, we propose an attention regularization method that adaptively learns regional attentions, capturing highly relevant regions and suppressing irrelevant ones so as to extract a complete feature for each AU. Extensive experiments show that our approach achieves substantial improvements over state-of-the-art AU detection methods on the BP4D and, especially, the DISFA benchmarks.
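
To make the idea of a spatio-temporal graph with learned edge weights more concrete, the following is a minimal NumPy sketch of one graph-convolution step over per-AU features. It is not the authors' architecture: the function name `st_graph_conv`, the softmax normalization of the edge logits, the fixed-size temporal kernel, and all shapes are assumptions chosen for illustration. In a real model the edge logits, temporal kernel, and projection weight would be trained parameters.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def st_graph_conv(x, edge_logits, temporal_kernel, weight):
    """One illustrative spatio-temporal graph convolution step (hypothetical).

    x:               (T, N, C_in) features for N AUs over T frames
    edge_logits:     (N, N) learnable spatial edge scores between AUs
    temporal_kernel: (K,) learnable weights over K neighbouring frames, K odd
    weight:          (C_in, C_out) learnable feature projection
    """
    num_frames = x.shape[0]

    # Spatial step: turn learnable logits into row-normalised edge weights,
    # then aggregate features from related AUs within each frame.
    adj = softmax(edge_logits, axis=-1)              # (N, N)
    spatial = np.einsum("ij,tjc->tic", adj, x)       # (T, N, C_in)

    # Temporal step: weighted sum of the same AU node across nearby frames.
    k = temporal_kernel.shape[0]
    pad = k // 2
    padded = np.pad(spatial, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    temporal = sum(
        temporal_kernel[i] * padded[i:i + num_frames] for i in range(k)
    )

    # Project channels and apply a ReLU non-linearity.
    return np.maximum(temporal @ weight, 0.0)        # (T, N, C_out)


# Example: 5 frames, 12 AUs, 8 input channels, 16 output channels.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 12, 8))
out = st_graph_conv(
    features,
    edge_logits=rng.normal(size=(12, 12)),
    temporal_kernel=np.array([0.25, 0.5, 0.25]),
    weight=rng.normal(size=(8, 16)),
)
print(out.shape)  # (5, 12, 16)
```

Because the edge weights come from free parameters rather than a fixed co-occurrence table, gradient-based training could adjust which AU pairs influence each other, which is the property the abstract contrasts with predefined edges.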
