Shape Preserving Facial Landmarks with Graph Attention Networks

13 Oct 2022 · Andrés Prados-Torreblanca, José M. Buenaposada, Luis Baumela

Top-performing landmark estimation algorithms are based on exploiting the excellent ability of large convolutional neural networks (CNNs) to represent local appearance. However, it is well known that they can only learn weak spatial relationships. To address this problem, we propose a model based on the combination of a CNN with a cascade of Graph Attention Network regressors. To this end, we introduce an encoding that jointly represents the appearance and location of facial landmarks and an attention mechanism to weigh the information according to its reliability. This is combined with a multi-task approach to initialize the location of graph nodes and a coarse-to-fine landmark description scheme. Our experiments confirm that the proposed model learns a global representation of the structure of the face, achieving top performance in popular benchmarks on head pose and landmark estimation. The improvement provided by our model is most significant in situations involving large changes in the local appearance of landmarks.
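To make the idea of attention-weighted aggregation over graph nodes more concrete, here is a minimal sketch of a single generic graph-attention step in plain Python. It is not the paper's model. It assumes a fully connected graph, node features that have already been projected, and an attention vector split into two hypothetical halves, `a_src` and `a_dst`. The function names are illustrative only.

```python
import math


def leaky_relu(x, slope=0.2):
    # Standard leaky ReLU, as commonly used for graph-attention scores.
    return x if x > 0 else slope * x


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attention_aggregate(features, a_src, a_dst):
    """One generic attention step over a fully connected graph (assumption).

    features: list of per-node feature vectors (assumed already projected).
    a_src, a_dst: hypothetical halves of the attention vector.
    Returns, for each node, the attention-weighted mean of all node features.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        # Unnormalised score of node j as seen from node i.
        scores = [
            leaky_relu(dot(a_src, features[i]) + dot(a_dst, features[j]))
            for j in range(n)
        ]
        alpha = softmax(scores)  # weights over neighbours sum to 1
        out.append(
            [sum(alpha[j] * features[j][d] for j in range(n)) for d in range(dim)]
        )
    return out
```

With zero attention vectors every neighbour receives the same weight, so each node's output is the plain average of all node features. Learned attention vectors would instead down-weight nodes whose features are judged less informative.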


Results from the Paper


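| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Face Alignment | 300W | SPIGA | NME_inter-ocular (%, Full) | 2.99 | # 10 |
| Face Alignment | 300W | SPIGA | NME_inter-ocular (%, Common) | 2.59 | # 6 |
| Face Alignment | 300W | SPIGA | NME_inter-ocular (%, Challenge) | 4.66 | # 11 |
| Face Alignment | 300W | SPIGA | NME_inter-pupil (%, Full) | 4.20 | # 9 |
| Face Alignment | 300W | SPIGA | NME_inter-pupil (%, Common) | 3.59 | # 8 |
| Face Alignment | 300W | SPIGA | NME_inter-pupil (%, Challenge) | 6.73 | # 8 |
| Facial Landmark Detection | 300W | SPIGA (Inter-ocular Norm) | NME | 2.99 | # 2 |
| Face Alignment | 300W (Common) | SPIGA | NME | 2.59 | # 1 |
| Pose Estimation | 300W (Full) | SPIGA | MAE mean (°) | 1.29 | # 1 |
| Pose Estimation | 300W (Full) | SPIGA | MAE yaw (°) | 1.41 | # 1 |
| Pose Estimation | 300W (Full) | SPIGA | MAE pitch (°) | 1.70 | # 1 |
| Pose Estimation | 300W (Full) | SPIGA | MAE roll (°) | 0.77 | # 1 |
| Face Alignment | 300W Split 2 | SPIGA | NME (box) | 2.03 | # 1 |
| Face Alignment | 300W Split 2 | SPIGA | AUC@7 (box) | 71.0 | # 1 |
| Face Alignment | 300W Split 2 | SPIGA | NME (inter-ocular) | 3.43 | # 1 |
| Face Alignment | 300W Split 2 | SPIGA | AUC@8 (inter-ocular) | 57.27 | # 1 |
| Face Alignment | 300W Split 2 | SPIGA | FR@8 (inter-ocular) | 0.67 | # 1 |
| Face Alignment | COFW-68 | SPIGA | NME (box) | 2.52 | # 1 |
| Face Alignment | COFW-68 | SPIGA | AUC@7 (box) | 64.1 | # 1 |
| Face Alignment | COFW-68 | SPIGA | NME (inter-ocular) | 3.93 | # 1 |
| Pose Estimation | MERL-RAV | SPIGA | MAE mean (°) | 2.39 | # 1 |
| Pose Estimation | MERL-RAV | SPIGA | MAE yaw (°) | 3.23 | # 1 |
| Pose Estimation | MERL-RAV | SPIGA | MAE pitch (°) | 2.24 | # 1 |
| Pose Estimation | MERL-RAV | SPIGA | MAE roll (°) | 1.71 | # 1 |
| Face Alignment | MERL-RAV | SPIGA | NME (box) | 1.51 | # 1 |
| Face Alignment | MERL-RAV | SPIGA | AUC@7 (box) | 78.47 | # 1 |
| Head Pose Estimation | WFLW | SPIGA | MAE mean (°) | 1.52 | # 1 |
| Head Pose Estimation | WFLW | SPIGA | MAE yaw (°) | 1.78 | # 1 |
| Head Pose Estimation | WFLW | SPIGA | MAE pitch (°) | 1.86 | # 1 |
| Head Pose Estimation | WFLW | SPIGA | MAE roll (°) | 0.93 | # 1 |
| Face Alignment | WFLW | SPIGA | NME (inter-ocular) | 4.06 | # 6 |
| Face Alignment | WFLW | SPIGA | AUC@10 (inter-ocular) | 60.56 | # 4 |
| Face Alignment | WFLW | SPIGA | FR@10 (inter-ocular) | 2.08 | # 2 |
| Face Alignment | WFW (Extra Data) | SPIGA | NME (inter-ocular) | 4.06 | # 3 |
| Face Alignment | WFW (Extra Data) | SPIGA | AUC@10 (inter-ocular) | 60.56 | # 3 |
| Face Alignment | WFW (Extra Data) | SPIGA | FR@10 (inter-ocular) | 2.08 | # 3 |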
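
For readers unfamiliar with the NME and FR metrics reported above, the following sketch shows how they are commonly computed. It assumes the usual definitions: NME is the mean point-to-point error divided by a normalising distance (here the ground-truth inter-ocular distance) and expressed as a percentage, and FR@t is the percentage of images whose NME exceeds the threshold t. The function names and eye-corner indices are hypothetical. The actual landmark indices depend on the dataset's annotation scheme, and this is not the evaluation code behind the figures in the table.

```python
import math


def nme_inter_ocular(pred, gt, left_eye_idx, right_eye_idx):
    """Normalised mean error (%) for one face.

    pred, gt: lists of (x, y) landmark coordinates in the same order.
    left_eye_idx, right_eye_idx: indices of the two eye landmarks used
    for normalisation (hypothetical; dataset dependent).
    """
    norm = math.dist(gt[left_eye_idx], gt[right_eye_idx])
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return 100.0 * sum(errors) / (len(errors) * norm)


def failure_rate(nmes, threshold):
    """Percentage of faces whose NME (%) exceeds the threshold (%)."""
    failures = sum(1 for value in nmes if value > threshold)
    return 100.0 * failures / len(nmes)
```

The box-normalised variants in the table follow the same pattern with a different normalising distance derived from the face bounding box.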

Methods