The Annotated Facial Landmarks in the Wild (AFLW) is a large-scale collection of annotated face images gathered from Flickr, exhibiting a large variety in appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions. In total about 25K faces are annotated with up to 21 landmarks per image.
124 PAPERS • 11 BENCHMARKS
AFLW2000-3D is a dataset of 2000 images that have been annotated with image-level 68-point 3D facial landmarks. This dataset is used for evaluation of 3D facial landmark detection models. The head poses are very diverse and often hard to be detected by a CNN-based face detector.
81 PAPERS • 8 BENCHMARKS
The dataset contains over 15K images of 20 people (6 females and 14 males - 4 people were recorded twice). For each frame, a depth image, the corresponding rgb image (both 640x480 pixels), and the annotation is provided. The head pose range covers about +-75 degrees yaw and +-60 degrees pitch. Ground truth is provided in the form of the 3D location of the head and its rotation.
22 PAPERS • 1 BENCHMARK
Biwi Kinect Head Pose is a challenging dataset mainly inspired by the automotive setup. It is acquired with the Microsoft Kinect sensor, a structured IR light device. It contains about 15k frame, with RGB. (640 × 480) and depth maps (640 × 480). Twenty subjects have been involved in the recordings: four of them were recorded twice, for a total of 24 sequences. The ground truth of yaw, pitch and roll angles is reported together with the head center and the calibration matrix.
3 PAPERS • NO BENCHMARKS YET
ICT-3DHP is collected using the Microsoft Kinect sensor and contains RGB images and depth maps of about 14k frames, divided in 10 sequences. The image resolution is 640 × 480 pixels. An hardware sensor (Polhemus Fastrack) is exploited to generate the ground truth annotation. The device is placed on a white cap worn by each subject, visible in both RGB and depth frames.
0 PAPER • NO BENCHMARKS YET