Estimating the head pose of a person is a crucial problem with a wide range of applications, such as aiding gaze estimation, modeling attention, fitting 3D models to video, and performing face alignment.
Through a multi-task learning mechanism, the recognition network exploits the dependencies among multiple face analysis tasks, such as facial landmark localization, head pose estimation, gender recognition, and face attribute estimation, at the image-representation level.
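The multi-task setup described above can be sketched as a shared representation feeding several lightweight task heads. The layer sizes, task names, and output dimensions below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared backbone: a single linear projection of the 128-dim image
# representation into a 64-dim shared feature (dimensions assumed).
W_shared = rng.standard_normal((128, 64))

# One lightweight linear head per face analysis task (output sizes assumed).
task_heads = {
    "landmarks": rng.standard_normal((64, 10)),  # 5 (x, y) landmark pairs
    "head_pose": rng.standard_normal((64, 3)),   # yaw, pitch, roll
    "gender": rng.standard_normal((64, 2)),      # two-class logits
}

def forward(image_features):
    """Run the shared representation through every task head."""
    shared = np.maximum(image_features @ W_shared, 0.0)  # ReLU on shared features
    return {name: shared @ W for name, W in task_heads.items()}

outputs = forward(rng.standard_normal(128))
```

Because every head reads the same shared feature, gradients from all tasks shape one representation, which is how the dependencies among the tasks are exploited during training.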
Although automatic gaze estimation is very important to a large variety of application areas, it is difficult to train accurate and robust gaze models, in large part due to the difficulty of collecting large and diverse data (annotating 3D gaze is expensive, and existing datasets use different setups).
We are the first to present domain adaptation for head pose estimation with a focus on partially shared and continuous label spaces.
With an ever-increasing number of mobile devices competing for our attention, quantifying when, how often, or for how long users visually attend to their devices has emerged as a core challenge in mobile human-computer interaction.
Specifically, the rectangular coordinates of only four non-coplanar feature points from a predefined 3D facial model, as well as the corresponding ones automatically or manually extracted from a 2D face image, are first normalized to exclude the effect of external factors (i.e., the scale factor and translation parameters).
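One common way to carry out such a normalization is to subtract the centroid (removing translation) and divide by the RMS distance from the centroid (removing scale). This is a generic similarity normalization offered as a sketch of the idea, not necessarily the paper's exact procedure:

```python
import numpy as np

def normalize_points(pts):
    """Remove translation and scale from a set of 2D or 3D points.

    After normalization the point set has zero centroid and unit RMS
    distance from the origin, so only shape (and rotation) remains.
    """
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)          # translation component
    centered = pts - centroid
    # RMS distance of the centered points acts as the scale factor.
    scale = np.sqrt((centered ** 2).sum(axis=1).mean())
    return centered / scale
```

With translation and scale factored out, the remaining correspondence between the 3D model points and the 2D image points is governed by rotation alone, which is what a pose estimator needs to recover.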
Head poses are a key component of human bodily communication and thus a decisive element of human-computer interaction.
We address the problem of estimating the pose of a person's head from an RGB image.
In addition, we evaluated the impact of the validation loss on landmark accuracy based on uniform sampling.
In this paper, we address the problem of robustly training a ConvNet for regression, i.e., deep robust regression.
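A standard ingredient in robust regression is a loss that is quadratic for small residuals but only linear for large ones, so outlier annotations contribute a bounded gradient. The Huber loss below is one such widely used choice, shown as a minimal sketch; the paper itself may use a different robust loss:

```python
import numpy as np

def huber_loss(pred, target, delta=1.0):
    """Huber loss: 0.5 * r^2 for |r| <= delta, delta * (|r| - 0.5 * delta) otherwise.

    Near zero it behaves like L2, giving smooth gradients; for large
    residuals it grows linearly, so a single bad label cannot dominate
    the training signal the way a squared loss would.
    """
    r = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    quadratic = 0.5 * r ** 2
    linear = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quadratic, linear)
```

For example, a residual of 3.0 with `delta=1.0` costs 2.5 under this loss versus 4.5 under squared error, which is the damping effect that makes regression training robust to label noise.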