The VGG Face dataset is a face identity recognition dataset that consists of 2,622 identities. It contains over 2.6 million images.
88 PAPERS • NO BENCHMARKS YET
IMDb-Face is a large-scale noise-controlled dataset for face recognition research. The dataset contains about 1.7 million faces of 59k identities, manually cleaned from 2.0 million raw images.
21 PAPERS • NO BENCHMARKS YET
The IDiff-Face dataset was proposed in the paper "IDiff-Face: Synthetic-based Face Recognition through Fizzy Identity-Conditioned Diffusion Models". This dataset is synthetically generated using the IDiff-Face model.
5 PAPERS • NO BENCHMARKS YET
A new face annotation dataset with a balanced distribution between genders and ethnic origins.
10 PAPERS • 2 BENCHMARKS
TinyFace is a large-scale face recognition benchmark to facilitate the investigation of native LRFR (Low Resolution Face Recognition) at large scales (large gallery population sizes) in deep learning. The TinyFace dataset consists of 5,139 labelled facial identities given by 169,403 native LR face images (average 20×16 pixels), designed for 1:N recognition tests. All the LR faces in TinyFace are collected from public web data across a large variety of imaging scenarios, captured under uncontrolled viewing conditions in pose, illumination, occlusion and background.
23 PAPERS • 1 BENCHMARK
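The 1:N recognition setting mentioned for TinyFace can be illustrated with a minimal sketch: each probe embedding is matched against a gallery of enrolled identities and the most similar one is returned. This is not the official TinyFace evaluation code; the function names, the use of cosine similarity, and the toy 2-D embeddings are assumptions made purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank1_identify(probe, gallery):
    """Return the gallery identity whose embedding is most similar to the probe."""
    return max(gallery, key=lambda identity: cosine(probe, gallery[identity]))


def rank1_accuracy(probes, gallery):
    """Fraction of probes whose top-ranked gallery identity is the true one."""
    hits = sum(1 for emb, true_id in probes if rank1_identify(emb, gallery) == true_id)
    return hits / len(probes)


# Hypothetical toy data: one embedding per enrolled identity.
gallery = {"id_a": [1.0, 0.0], "id_b": [0.0, 1.0], "id_c": [-1.0, 0.0]}
# Each probe is (embedding, true identity).
probes = [([0.9, 0.1], "id_a"), ([0.2, 0.8], "id_b"), ([0.6, 0.7], "id_a")]

print(rank1_accuracy(probes, gallery))
```

In practice the embeddings would come from a face recognition model, and the gallery would hold thousands of identities; the matching logic stays the same.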
The iCartoonFace dataset is a large-scale dataset that can be used for two different tasks: cartoon face detection and cartoon face recognition.
7 PAPERS • 1 BENCHMARK
…The Chicago Face Database was developed at the University of Chicago by Debbie S. Ma, Joshua Correll, and Bernd Wittenbrink. The CFD is intended for use in scientific research. It provides high-resolution, standardized photographs of male and female faces of varying ethnicity between the ages of 17 and 65. Extensive norming data are available for each individual model … (e.g., face size) as well as subjective ratings by independent judges (e.g., attractiveness).
1 PAPER • NO BENCHMARKS YET
QMUL-SurvFace is a surveillance face recognition benchmark that contains 463,507 face images of 15,573 distinct identities captured in real-world uncooperative surveillance scenes over wide space and time.
10 PAPERS • 1 BENCHMARK
CASIA-Face-Africa is a face image database which contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. The proposed database, along with its face landmark annotations, evaluation protocols and preliminary results, forms a good benchmark to study the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc.
2 PAPERS • NO BENCHMARKS YET
The CASIA-WebFace dataset is used for face verification and face identification tasks. The dataset contains 494,414 face images of 10,575 real identities collected from the web.
383 PAPERS • 2 BENCHMARKS
The COVID-19 pandemic raises the problem of adapting face recognition systems to the new reality, where people may wear surgical masks to cover their noses and mouths. Data sets (e.g., CelebA, CASIA-WebFace) used for training these systems were released before the pandemic, so they now seem unsuited due to the lack of examples of people wearing masks. We propose a method for enhancing data sets containing faces without masks by creating synthetic masks and overlaying them on faces in the original images. Our method relies on Spark AR Studio, a developer program made by Facebook that is used to create Instagram face filters. In our approach, we use 9 masks of different colors, shapes and fabrics. We employ our method to generate 445,446 (90%) samples of masks for the CASIA-WebFace data set.
4 PAPERS • 1 BENCHMARK
DigiFace-1M is a synthetic dataset for face recognition, obtained by rendering digital faces using a computer graphics pipeline. It contains 1.22M images of 110K unique identities. Following the format of the existing datasets, we provide the aligned crop around the face, resized into $112 \times 112$ resolution. Please visit the website for more detail.
14 PAPERS • NO BENCHMARKS YET
WebFace260M is a million-scale face benchmark, which is constructed for the research community towards closing the data gap behind the industry. It consists of:
- Noisy 4M identities and 260M faces
- High-quality training data with 42M images of 2M identities by using automatic cleaning
- A test set with rich attributes and a time-constrained evaluation
18 PAPERS • NO BENCHMARKS YET
Description: 23 Pairs of Identical Twins Face Image Data. The collection scenes include indoor and outdoor scenes. The subjects are Chinese males and females. The data diversity includes multiple face angles, multiple face postures, close-ups of eyes, multiple light conditions and multiple age groups. This dataset can be used for tasks such as twins' face recognition.
0 PAPERS • NO BENCHMARKS YET
Celeb-HQ Face Gender Recognition Dataset: This dataset is curated for the face gender classification task. It contains 30,000 images, of which 23,999 are training images. The face images are divided into two classes: 11,057 male images and 18,943 female images.
Description: 1,995 People Face Images Data (Asian race). For each subject, more than 20 frontal-face images were collected. This data can be used for face recognition and other tasks. Data size: 1,995 people, more than 20 frontal-face images per person. Race distribution: Asian people.
Description: 5,011 Images – Human Frontal face Data (Male). The data diversity includes multiple scenes, multiple ages and multiple races. This dataset includes 2,004 Caucasians and 3,007 Asians. This dataset can be used for tasks such as face detection, race detection, age detection, and beard category classification.
FAD is a dataset that has roughly 200,000 attribute labels for the above traits, for over 10,000 facial images.
An occluded version of the LFW dataset for occluded face recognition verification. Uses structured occlusions generated to seem more realistic.
Consists of a large number of unconstrained multi-view and partially occluded faces.
Dataset originally conceived for multi-face tracking/detection for highly crowded scenarios. In these scenarios, the face is the only part that can be used to track the individuals. All our videos present novel crowd scenes recorded at near-eye level, where faces are visible enough to be analysed at the microscopic level, while also benefiting from a macroscopic view of the crowd.
It includes:
- Face detections of 715 unique subjects along with instructions to download the synchronized video.
- More than 75k face detections annotated.
Our dataset may be useful for:
- Face tracking, especially relevant for crowded scenarios (typically from video-surveillance cameras).
- Heavily occluded body tracking (in many videos, only the face is mostly visible).
- Face recognition.
- Face detection for partially occluded faces.
Real-World Masked Face Dataset (RMFD) is a large dataset for masked face detection.
10 PAPERS • NO BENCHMARKS YET
MCXFace is a heterogeneous face recognition dataset consisting of multi-channel image samples for 51 subjects. The Multi-Channel Heterogeneous Face Recognition dataset (MCXFace) is derived from the HQ-WMCA dataset (https://www.idiap.ch/en/dataset/hq-wmca).
3 PAPERS • NO BENCHMARKS YET
…Traditional face recognition systems may not effectively recognize masked faces, but removing the mask for authentication will increase the risk of virus infection. Inspired by the COVID-19 pandemic response, the widespread requirement that people wear protective face masks in public places has driven a need to understand how face recognition technology deals with occluded faces, often with just the periocular area and above visible. … topic of face recognition on people wearing masks. In this workshop, we will organise the Masked Face Recognition (MFR) challenge and focus on benchmarking deep face recognition methods in the presence of facial masks.
13 PAPERS • 1 BENCHMARK
ROF is a dataset for occluded face recognition that contains faces with both upper face occlusion, due to sunglasses, and lower face occlusion, due to masks.
MeGlass is an eyeglass dataset originally designed for eyeglass face recognition evaluation. All the face images are selected and cleaned from MegaFace. Each identity has at least two face images with eyeglasses and two face images without eyeglasses. It contains 47,817 images from 1,710 different identities.
8 PAPERS • NO BENCHMARKS YET
The proposed Extended-YouTube Faces (E-YTF) is an extension of the famous YouTube Faces (YTF) dataset and is specifically designed to further push the challenges of face recognition by addressing the problem of open-set face identification from heterogeneous data, i.e., still images vs. video.
Unconstrained Face Detection and Open-Set Face Recognition Challenge (paper: https://arxiv.org/abs/1708.02337; official website: https://vast.uccs.edu/Opensetface; unofficial website: https://exposing.ai/uccs). Face detection and recognition benchmarks have shifted toward more difficult environments. Although face verification or closed-set face identification have surpassed human capabilities on some datasets, open-set identification is much more complex as it needs to reject both unknown identities and false accepts from the face detector. By contrast, open-set face recognition is currently weak and requires much more attention.
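To make the difference between closed-set and open-set identification concrete, the following minimal sketch adds a rejection step: a probe is assigned to the best-matching gallery identity only if the similarity clears a threshold, and is otherwise reported as unknown. The function names, the cosine similarity measure, the threshold value of 0.8, and the toy embeddings are all hypothetical and are not taken from the challenge's protocol.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def open_set_identify(probe, gallery, threshold):
    """Return the best-matching identity, or None if the probe is rejected as unknown."""
    best_id, best_score = None, -1.0
    for identity, embedding in gallery.items():
        score = cosine(probe, embedding)
        if score > best_score:
            best_id, best_score = identity, score
    # Rejection step: a weak best match is treated as an unknown identity
    # (or a false accept from the face detector).
    return best_id if best_score >= threshold else None


# Hypothetical toy gallery and threshold.
gallery = {"id_a": [1.0, 0.0], "id_b": [0.0, 1.0]}
THRESHOLD = 0.8  # assumed value; in practice tuned on held-out data

print(open_set_identify([0.95, 0.05], gallery, THRESHOLD))  # close to id_a
print(open_set_identify([0.7, 0.7], gallery, THRESHOLD))    # ambiguous probe
```

Choosing the threshold trades off false rejections of known identities against false acceptances of unknown ones, which is why open-set results are usually reported at fixed false-alarm rates rather than as a single accuracy.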
A renovation of Labeled Faces in the Wild (LFW), the de facto standard testbed for unconstrained face verification.
15 PAPERS • 4 BENCHMARKS
A renovation of Labeled Faces in the Wild (LFW), the de facto standard testbed for unconstrained face verification. There are three motivations behind the construction of the CPLFW benchmark: 1. Establishing a relatively more difficult database to evaluate the performance of real-world face verification, so the effectiveness of several face verification methods can be fully justified. 2. Continuing the intensive research on LFW with more realistic consideration of pose intra-class variation and fostering the research on cross-pose face verification in unconstrained situations. 3. … and the same identities in LFW, so one can easily apply CPLFW to evaluate the performance of face verification.
The Masked LFW (MLFW), based on the Cross-Age LFW (CALFW) database, is built using a simple but effective tool that generates masked faces from unmasked faces automatically.
5 PAPERS • 1 BENCHMARK
The IJB-B dataset is a template-based face dataset that contains 1,845 subjects with 11,754 images, 55,025 frames and 7,011 videos, where a template consists of a varying number of still images and video … In addition, the dataset comes with protocols for 1-to-1 template-based face verification, 1-to-N template-based open-set face identification, and 1-to-N open-set video face identification.
143 PAPERS • 5 BENCHMARKS
The Curated AFD dataset is a curated version of the Asian Face Dataset (AFD) for face recognition research.
The COVID-19 pandemic raises the problem of adapting face recognition systems to the new reality, where people may wear surgical masks to cover their noses and mouths. Data sets (e.g., CelebA, CASIA-WebFace) used for training these systems were released before the pandemic, so they now seem unsuited due to the lack of examples of people wearing masks. We propose a method for enhancing data sets containing faces without masks by creating synthetic masks and overlaying them on faces in the original images. Our method relies on Spark AR Studio, a developer program made by Facebook that is used to create Instagram face filters. In our approach, we use 9 masks of different colors, shapes and fabrics.
The LFW dataset contains 13,233 images of faces collected from the web. The dataset covers 5,749 identities, of which 1,680 have two or more images. In the standard LFW evaluation protocol, verification accuracies are reported on 6,000 face pairs.
784 PAPERS • 13 BENCHMARKS
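As a rough illustration of how a verification accuracy over face pairs (such as the LFW pairs mentioned above) can be computed, the sketch below scores each pair as "same" when its similarity reaches a threshold and counts the fraction of correct decisions. It is a simplified stand-in, not the official LFW evaluation script; the function name, the thresholds, and the six toy pairs are assumptions for illustration only.

```python
def verification_accuracy(scored_pairs, threshold):
    """Fraction of pairs where 'similarity >= threshold' agrees with the same/different label."""
    correct = sum(
        1 for score, is_same in scored_pairs if (score >= threshold) == is_same
    )
    return correct / len(scored_pairs)


# Hypothetical (similarity score, is_same_person) pairs; a real run would use
# model similarities for every pair in the evaluation list.
scored_pairs = [
    (0.91, True),
    (0.78, True),
    (0.42, True),
    (0.35, False),
    (0.12, False),
    (0.66, False),
]

for threshold in (0.5, 0.7):  # assumed candidate thresholds
    print(threshold, verification_accuracy(scored_pairs, threshold))
```

The threshold should be chosen on data that is disjoint from the pairs being scored; otherwise the reported accuracy is optimistically biased.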
Comprises real human and wax figure images and videos for studying the problem of face spoofing detection. The dataset consists of more than 1,800 face images and 110 videos of 55 people/waxworks, arranged in training, validation and test sets with a large range of expression, illumination and pose variations.
UMDFaces is a face dataset divided into two parts. Part 1 (Still Images) contains 367,888 face annotations for 8,277 subjects, divided into 3 batches. The annotations contain human-curated bounding boxes for faces, plus estimated pose (yaw, pitch, and roll), locations of twenty-one keypoints, and gender information generated by a pre-trained neural network.
30 PAPERS • NO BENCHMARKS YET
CASIA-FASD is a small face anti-spoofing dataset containing 50 subjects.
37 PAPERS • NO BENCHMARKS YET
WildestFaces is tailored to study cross-domain recognition under a variety of adverse conditions.
An evaluation protocol for face verification focusing on a large intra-pair image quality difference. Real-world face recognition applications often deal with suboptimal image quality or resolution due to different capturing conditions such as various subject-to-camera distances, poor camera settings, … Recent cross-resolution face recognition approaches used simple, arbitrary, and unrealistic down- and up-scaling techniques to measure robustness against real-world edge cases in image quality. Thus, we propose a new standardized benchmark dataset and evaluation protocol derived from the famous Labeled Faces in the Wild (LFW). In contrast to previous derivatives, which focus on pose, age, similarity, and adversarial attacks, our Cross-Quality Labeled Faces in the Wild (XQLFW) maximizes the quality difference.
15 PAPERS • 1 BENCHMARK
…The existence of such large weak-labeled databases has gained importance in the training of face recognition algorithms. Starting with the publicly available YFCC100M, we propose a weakly-labeled subset for multi-label face recognition for self-supervised methods.
…The dataset is collected from the Internet, resulting in unconstrained face images similar to real world settings.
7 PAPERS • 3 BENCHMARKS
Dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of subjects with videos, and each sample has three modalities (i.e., RGB, Depth and IR).
20 PAPERS • NO BENCHMARKS YET
Although deep face recognition has achieved impressive results in recent years, there is increasing controversy regarding racial and gender bias of the models, questioning their trustworthiness and deployment … We also propose a benchmark of experiments using DemogPairs over state-of-the-art deep face recognition models in order to analyze their cross-demographic behavior and potential demographic biases (see …).
7 PAPERS • NO BENCHMARKS YET
The color FERET database is a dataset for face recognition. It contains 11,338 color images of size 512×768 pixels captured in a semi-controlled environment with 13 different poses from 994 subjects.
34 PAPERS • 3 BENCHMARKS
The Extended Yale B database contains 2,414 frontal-face images of size 192×168 from 38 subjects, with about 64 images per subject.
180 PAPERS • 1 BENCHMARK
The MS-Celeb-1M dataset is a large-scale face recognition dataset consisting of 100K identities, each with about 100 facial images.
246 PAPERS • NO BENCHMARKS YET
…All the caricatures and face images were collected from the Web.
6 PAPERS • NO BENCHMARKS YET