NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition

10 Jun 2022 · Hanting Li, Mingzhe Sui, Zhaoqing Zhu, Feng Zhao

Dynamic facial expression recognition (DFER) in the wild is extremely challenging because video sequences contain many noisy frames. Previous work focuses on extracting more discriminative features but neglects to distinguish key frames from noisy ones. To tackle this problem, we propose a noise-robust dynamic facial expression recognition network (NR-DFERNet) that reduces the interference of noisy frames on the DFER task. Specifically, at the spatial stage, we devise a dynamic-static fusion module (DSF) that introduces dynamic features into static features to learn more discriminative spatial features. To suppress the impact of target-irrelevant frames, we introduce a novel dynamic class token (DCT) for the transformer at the temporal stage. Moreover, we design a snippet-based filter (SF) at the decision stage to reduce the effect of too many neutral frames on the classification of non-neutral sequences. Extensive experimental results demonstrate that NR-DFERNet outperforms state-of-the-art methods on both the DFEW and AFEW benchmarks.
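The abstract does not say how the snippet-based filter operates, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the sequence has already been split into snippets, that each snippet has a class-probability vector, and that snippets predicted as neutral are dropped before the remaining ones are averaged. The function name `snippet_filter` and the parameter `neutral_idx` are hypothetical.

```python
def snippet_filter(snippet_probs, neutral_idx):
    """Sequence-level label from per-snippet class probabilities.

    Assumption (not from the paper): snippets whose top class is
    neutral are discarded, and the rest are averaged. If every
    snippet is predicted neutral, the sequence is labeled neutral.
    """
    if not snippet_probs:
        raise ValueError("snippet_probs must not be empty")

    def argmax(probs):
        return max(range(len(probs)), key=probs.__getitem__)

    # Keep only snippets whose own prediction is non-neutral.
    kept = [p for p in snippet_probs if argmax(p) != neutral_idx]
    if not kept:
        return neutral_idx

    # Average the surviving snippets class by class, then pick the top class.
    n_classes = len(kept[0])
    mean = [sum(p[c] for p in kept) / len(kept) for c in range(n_classes)]
    return argmax(mean)


# Two neutral-looking snippets and one expressive snippet (class 1).
probs = [[0.7, 0.2, 0.1], [0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]
label = snippet_filter(probs, neutral_idx=0)
```

In this toy input, a plain average over all three snippets would favor the neutral class, while the filtered average selects class 1; that is the kind of effect the abstract attributes to the SF.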


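Task | Dataset | Model | Metric | Value | Global Rank
Dynamic Facial Expression Recognition | DFEW | NR-DFERNet | WAR | 68.19 | #11
Dynamic Facial Expression Recognition | DFEW | NR-DFERNet | UAR | 54.21 | #13
Dynamic Facial Expression Recognition | FERV39k | NR-DFERNet | WAR | 45.97 | #9
Dynamic Facial Expression Recognition | FERV39k | NR-DFERNet | UAR | 33.99 | #9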
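The page does not define the two metrics. Assuming the definitions usual for DFER benchmarks, WAR (weighted average recall) is overall accuracy and UAR (unweighted average recall) is the mean of the per-class recalls, so UAR penalizes poor performance on rare classes more heavily. A minimal sketch under that assumption, with a hypothetical helper `war_uar` and made-up toy labels:

```python
from collections import Counter


def war_uar(y_true, y_pred):
    """Return (WAR, UAR) as fractions in [0, 1].

    WAR: correct predictions divided by all samples.
    UAR: recall computed per class, then averaged with equal weight.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    total = Counter(y_true)  # samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    war = sum(correct.values()) / len(y_true)
    uar = sum(correct[c] / total[c] for c in total) / len(total)
    return war, uar


# Imbalanced toy data: 8 "happy" clips, 2 "fear" clips, one "fear" missed.
y_true = ["happy"] * 8 + ["fear"] * 2
y_pred = ["happy"] * 8 + ["happy", "fear"]
war, uar = war_uar(y_true, y_pred)
```

With this toy data the single missed "fear" clip costs one tenth of WAR but a quarter of UAR, which is consistent with UAR sitting well below WAR in the results above if the benchmark classes are imbalanced.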
