Facial expression and attributes recognition based on multi-task learning of lightweight neural networks

31 Mar 2021  ·  Savchenko A.V.

In this paper, we examine the multi-task training of lightweight convolutional neural networks for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins. We show that these networks still need to be fine-tuned in order to predict facial expressions. Several models are presented based on the MobileNet, EfficientNet and RexNet architectures. Experiments show that our models achieve state-of-the-art emotion classification accuracy on the AffectNet dataset and near state-of-the-art results in age, gender and race recognition on the UTKFace dataset. Moreover, using our neural network as a feature extractor for facial regions in video frames and concatenating several statistical functions (mean, max, etc.) of the frame-level features leads to 4.5% higher accuracy than the previously known state-of-the-art single models for the AFEW and VGAF datasets from the EmotiW challenges.
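
The video-level aggregation mentioned in the abstract can be illustrated with a small sketch. It assumes each video frame has already been turned into a fixed-length feature vector by some face feature extractor (not shown here), and it concatenates per-dimension statistics into one video descriptor. The function name `video_descriptor`, the toy feature values, and the choice of min and standard deviation alongside mean and max are assumptions for illustration; the abstract names only "mean, max, etc.".

```python
from statistics import fmean, pstdev
from typing import Sequence


def video_descriptor(frame_features: Sequence[Sequence[float]]) -> list[float]:
    """Concatenate per-dimension statistics of frame-level features.

    Hypothetical helper: the exact set of statistics is an assumption,
    since the abstract lists only "mean, max, etc.".
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frame features must have the same length")

    # Transpose so that each tuple holds one feature dimension across frames.
    columns = list(zip(*frame_features))

    means = [fmean(c) for c in columns]
    maxima = [max(c) for c in columns]
    minima = [min(c) for c in columns]   # assumed extra statistic
    stds = [pstdev(c) for c in columns]  # assumed extra statistic

    # The descriptor is 4 * dim long, regardless of the number of frames.
    return means + maxima + minima + stds


# Toy example: 3 frames with 2-dimensional made-up embeddings.
frames = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
descriptor = video_descriptor(frames)
```

The resulting vector has a fixed length independent of the number of frames, so videos of different durations can be fed to the same downstream classifier.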

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Uses Extra Training Data |
|---|---|---|---|---|---|---|
| Facial Expression Recognition (FER) | Acted Facial Expressions In The Wild (AFEW) | Multi-task EfficientNet-B0 | Accuracy (on validation set) | 59.27 | # 6 | |
| Facial Expression Recognition (FER) | AffectNet | Multi-task EfficientNet-B0 | Accuracy (7 emotion) | 65.74 | # 7 | |
| Facial Expression Recognition (FER) | AffectNet | Multi-task EfficientNet-B0 | Accuracy (8 emotion) | 61.32 | # 10 | |