A Dual-Direction Attention Mixed Feature Network for Facial Expression Recognition

Facial expression recognition (FER) has received considerable attention in computer vision research in recent years. This paper presents the Dual-Direction Attention Mixed Feature Network (DDAMFN), a network designed for FER that is both robust and lightweight. The architecture has two main components: the Mixed Feature Network (MFN), which serves as the backbone, and the Dual-Direction Attention Network (DDAN), which serves as the head. In the MFN, mixed-size kernels are used to extract resilient features. A new Dual-Direction Attention (DDA) head generates attention maps in two orientations, enabling the model to capture long-range dependencies effectively. To further improve accuracy, a novel attention loss for the DDAN is introduced so that different heads focus on distinct areas of the input. Experimental evaluations on several widely used public datasets, including AffectNet, RAF-DB, and FERPlus, show that the DDAMFN outperforms existing models, establishing it as the state-of-the-art model for FER.
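
The idea of generating attention maps in two orientations can be illustrated with a small sketch. The code below is not the DDAMFN implementation; it assumes a single-channel feature map, plain averaging along each direction, and a sigmoid gate, and it omits the learned layers a real head would contain. The function names `directional_attention` and `apply_attention` are hypothetical. It shows only why pooling along rows and columns lets each position's weight depend on distant positions in the same row or column.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def directional_attention(feature_map):
    """Build an H x W attention map from two directional summaries.

    feature_map: list of H rows, each a list of W floats (one channel).
    Assumption: average pooling plus a sigmoid gate stands in for the
    learned layers of a real attention head.
    """
    h = len(feature_map)
    w = len(feature_map[0])
    # Horizontal direction: summarize each row across its full width.
    row_means = [sum(row) / w for row in feature_map]
    # Vertical direction: summarize each column across its full height.
    col_means = [sum(feature_map[i][j] for i in range(h)) / h for j in range(w)]
    row_gate = [sigmoid(m) for m in row_means]
    col_gate = [sigmoid(m) for m in col_means]
    # Each position combines the gate of its row with the gate of its column.
    return [[row_gate[i] * col_gate[j] for j in range(w)] for i in range(h)]


def apply_attention(feature_map, attn):
    """Reweight the feature map element-wise by the attention map."""
    return [[v * a for v, a in zip(f_row, a_row)]
            for f_row, a_row in zip(feature_map, attn)]
```

In this sketch, changing a value at one end of a row changes the attention weight at the other end of the same row, which is one simple way a directional summary can carry long-range information.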

Task: Facial Expression Recognition (FER)

Dataset     Model      Metric Name            Metric Value   Global Rank
AffectNet   DDAMFN++   Accuracy (7 emotion)   67.36          #5
AffectNet   DDAMFN++   Accuracy (8 emotion)   65.04          #1
AffectNet   DDAMFN     Accuracy (7 emotion)   67.03          #6
AffectNet   DDAMFN     Accuracy (8 emotion)   64.25          #2
FER+        DDAMFN     Accuracy               90.74          #3
RAF-DB      DDAMFN++   Overall Accuracy       92.34          #3
RAF-DB      DDAMFN     Overall Accuracy       91.35          #7

