61 dataset results for Face Recognition

LFW

The LFW dataset contains 13,233 images of faces collected from the web. It covers 5,749 identities, 1,680 of whom have two or more images. In the standard LFW evaluation protocol, verification accuracy is reported on 6,000 face pairs.

784 PAPERS • 13 BENCHMARKS

CASIA-WebFace

The CASIA-WebFace dataset is used for face verification and face identification tasks. The dataset contains 494,414 face images of 10,575 real identities collected from the web.

386 PAPERS • 2 BENCHMARKS

MS-Celeb-1M

The MS-Celeb-1M dataset is a large-scale face recognition dataset that consists of 100K identities, each with about 100 facial images. The original identity labels are obtained automatically from webpages.

246 PAPERS • NO BENCHMARKS YET

Extended Yale B

The Extended Yale B database contains 2414 frontal-face images with size 192×168 over 38 subjects and about 64 images per subject. The images were captured under different lighting conditions and various facial expressions.

180 PAPERS • 1 BENCHMARK

MORPH

MORPH is a facial age estimation dataset, which contains 55,134 facial images of 13,617 subjects ranging from 16 to 77 years old.

169 PAPERS • 8 BENCHMARKS

IJB-B (IARPA Janus Benchmark-B)

The IJB-B dataset is a template-based face dataset that contains 1845 subjects with 11,754 images, 55,025 frames and 7,011 videos where a template consists of a varying number of still images and video frames from different sources. These images and videos are collected from the Internet and are totally unconstrained, with large variations in pose, illumination, image quality etc. In addition, the dataset comes with protocols for 1-to-1 template-based face verification, 1-to-N template-based open-set face identification, and 1-to-N open-set video face identification.

143 PAPERS • 5 BENCHMARKS

Adience

The Adience dataset, published in 2014, contains 26,580 photos across 2,284 subjects with a binary gender label and one label from eight different age groups, partitioned into five splits. The key principle of the data set is to capture the images as close to real world conditions as possible, including all variations in appearance, pose, lighting condition and image quality, to name a few.

115 PAPERS • 6 BENCHMARKS

VGG Face

The VGG Face dataset is a face identity recognition dataset that consists of 2,622 identities. It contains over 2.6 million images.

88 PAPERS • NO BENCHMARKS YET

RFW (Racial Faces in-the-Wild)

RFW is a dataset used to validate the racial bias of four commercial APIs and four state-of-the-art (SOTA) algorithms.

61 PAPERS • NO BENCHMARKS YET

Partial-REID

Partial REID is a specially designed partial person re-identification dataset that includes 600 images from 60 people, with 5 full-body images and 5 occluded images per person. These images were collected on a university campus by 6 cameras, with different viewpoints, backgrounds and types of occlusion. Examples of partial persons in the Partial REID dataset are shown in the Figure.

39 PAPERS • NO BENCHMARKS YET

CASIA-FASD

CASIA-FASD is a small face anti-spoofing dataset containing 50 subjects.

37 PAPERS • NO BENCHMARKS YET

Color FERET

The color FERET database is a dataset for face recognition. It contains 11,338 color images of size 512×768 pixels captured in a semi-controlled environment with 13 different poses from 994 subjects.

34 PAPERS • 3 BENCHMARKS

UMDFaces

UMDFaces is a face dataset divided into two parts:

30 PAPERS • NO BENCHMARKS YET

CelebA-Spoof

CelebA-Spoof is a large-scale face anti-spoofing dataset with the following properties:

26 PAPERS • NO BENCHMARKS YET

TinyFace

TinyFace is a large-scale face recognition benchmark to facilitate the investigation of native low-resolution face recognition (LRFR) at large scales (large gallery population sizes) in deep learning. The TinyFace dataset consists of 5,139 labelled facial identities given by 169,403 native LR face images (average 20×16 pixels) designed for 1:N recognition tests. All the LR faces in TinyFace are collected from public web data across a large variety of imaging scenarios, captured under uncontrolled viewing conditions in pose, illumination, occlusion and background.

23 PAPERS • 1 BENCHMARK

IMDb-Face

IMDb-Face is a large-scale noise-controlled dataset for face recognition research. The dataset contains about 1.7 million faces of 59k identities, manually cleaned from 2.0 million raw images. All images are obtained from the IMDb website.

21 PAPERS • NO BENCHMARKS YET

CASIA-SURF

A dataset for face anti-spoofing that spans multiple subjects and modalities. Each subject has videos, and each sample has three modalities (RGB, Depth and IR).

20 PAPERS • NO BENCHMARKS YET

Replay-Mobile

The Replay-Mobile database for face spoofing consists of 1190 video clips of photo and video attack attempts against 40 clients, under different lighting conditions. These videos were recorded with devices that were current on the market at the time: an iPad Mini2 (running iOS) and an LG-G4 smartphone (running Android). The database was produced at the Idiap Research Institute (Switzerland) in collaboration with the Galician Research and Development Center in Advanced Telecommunications, Gradiant (Spain).

18 PAPERS • NO BENCHMARKS YET

WebFace260M

WebFace260M is a million-scale face benchmark, which is constructed for the research community towards closing the data gap behind the industry.

18 PAPERS • NO BENCHMARKS YET

CALFW (Cross-Age LFW)

A renovation of Labeled Faces in the Wild (LFW), the de facto standard testbed for unconstrained face verification.

15 PAPERS • 4 BENCHMARKS

CPLFW (Cross-Pose LFW)

A renovation of Labeled Faces in the Wild (LFW), the de facto standard testbed for unconstrained face verification.

15 PAPERS • 4 BENCHMARKS

XQLFW (Cross-Quality Labeled Faces in the Wild)

An evaluation protocol for face verification focusing on a large intra-pair image quality difference.

15 PAPERS • 1 BENCHMARK

DigiFace-1M

DigiFace-1M is a synthetic dataset for face recognition, obtained by rendering digital faces using a computer graphics pipeline. It contains 1.22M images of 110K unique identities. The dataset consists of two parts. The first part contains 720K images with 10K identities. For each identity, 4 different sets of accessories are sampled and 18 images are rendered for each set. The second part contains 500K images with 100K identities. For each identity, only one set of accessories is sampled and only 5 images are rendered. Following the format of the existing datasets, we provide the aligned crop around the face, resized into 112×112 resolution.
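The part sizes quoted for DigiFace-1M can be cross-checked with a little arithmetic. The sketch below is only an illustration: the dictionary layout and function names are hypothetical, and the "K" figures are assumed to be exact round numbers (10K = 10,000 and so on).

```python
# Figures taken from the DigiFace-1M description; the names used here are illustrative only.
# Assumption: "10K" and "100K" mean exactly 10,000 and 100,000 identities.
PARTS = {
    "part_1": {"identities": 10_000, "accessory_sets": 4, "images_per_set": 18},
    "part_2": {"identities": 100_000, "accessory_sets": 1, "images_per_set": 5},
}


def images_in_part(part: dict) -> int:
    """Images in one part = identities x accessory sets x images rendered per set."""
    return part["identities"] * part["accessory_sets"] * part["images_per_set"]


def summarize(parts: dict) -> dict:
    """Aggregate per-part image counts, total images and total identities."""
    per_part = {name: images_in_part(p) for name, p in parts.items()}
    return {
        "images_per_part": per_part,
        "total_images": sum(per_part.values()),
        "total_identities": sum(p["identities"] for p in parts.values()),
    }


summary = summarize(PARTS)
print(summary["images_per_part"])   # {'part_1': 720000, 'part_2': 500000}
print(summary["total_images"])      # 1220000
print(summary["total_identities"])  # 110000
```

Under these assumptions the two parts add up to the 1.22M images and 110K identities stated in the description.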

14 PAPERS • NO BENCHMARKS YET

MFR (Ongoing version of ICCV-2021 Masked Face Recognition Challenge & Workshop(MFR))

During the COVID-19 coronavirus epidemic, almost everyone wears a facial mask, which poses a huge challenge to face recognition. Traditional face recognition systems may not effectively recognize the masked faces, but removing the mask for authentication will increase the risk of virus infection. Inspired by the COVID-19 pandemic response, the widespread requirement that people wear protective face masks in public places has driven a need to understand how face recognition technology deals with occluded faces, often with just the periocular area and above visible.

13 PAPERS • 1 BENCHMARK

RMFD (Real-World Masked Face Dataset)

Real-World Masked Face Dataset (RMFD) is a large dataset for masked face detection.

11 PAPERS • NO BENCHMARKS YET

DiveFace

A new face annotation dataset with a balanced distribution between genders and ethnic origins.

10 PAPERS • 2 BENCHMARKS

QMUL-SurvFace

QMUL-SurvFace is a surveillance face recognition benchmark that contains 463,507 face images of 15,573 distinct identities captured in real-world uncooperative surveillance scenes over wide space and time.

10 PAPERS • 1 BENCHMARK

MeGlass

MeGlass is an eyeglass dataset originally designed for evaluating face recognition with eyeglasses. All the face images are selected and cleaned from MegaFace. Each identity has at least two face images with eyeglasses and two face images without eyeglasses. It contains 47,817 images from 1,710 different identities.

8 PAPERS • NO BENCHMARKS YET

DFW (Disguised Faces in the Wild)

DFW contains over 11,000 images of 1,000 identities with different types of disguise accessories. The dataset is collected from the Internet, resulting in unconstrained face images similar to real-world settings.

7 PAPERS • 3 BENCHMARKS

DemogPairs

Although deep face recognition has achieved impressive results in recent years, there is increasing controversy regarding racial and gender bias of the models, questioning their trustworthiness and deployment into sensitive scenarios. DemogPairs is a validation set with 10.8K facial images and 58.3M identity verification pairs, distributed in demographically-balanced folds of Asian, Black and White females and males. We also propose a benchmark of experiments using DemogPairs over state-of-the-art deep face recognition models in order to analyze their cross-demographic behavior and potential demographic biases (see figure below).
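The two DemogPairs figures appear to be related by simple combinatorics. As a rough check, and assuming that "10.8K" means exactly 10,800 images and that the 58.3M figure counts every unordered pair of images (both are assumptions, not statements from the description), the pair count can be reproduced as follows. The function name is illustrative.

```python
from math import comb


def unordered_pairs(n_images: int) -> int:
    """Number of distinct image pairs that can be formed from n_images images."""
    return comb(n_images, 2)


# Assumption: "10.8K facial images" is taken as exactly 10,800.
n_images = 10_800
total_pairs = unordered_pairs(n_images)
print(f"{total_pairs:,}")  # 58,314,600, i.e. roughly 58.3M
```

The result is consistent with the 58.3M verification pairs quoted above, which suggests the validation set pairs every image with every other image.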

7 PAPERS • NO BENCHMARKS YET

iCartoonFace

The iCartoonFace dataset is a large-scale dataset that can be used for two different tasks: cartoon face detection and cartoon face recognition.

7 PAPERS • 1 BENCHMARK

WebCaricature Dataset

The WebCaricature dataset aims to facilitate research in caricature recognition. All the caricatures and face images were collected from the Web. Compared with two existing datasets, this dataset is much more challenging, with a much greater number of available images, artistic styles and larger intra-personal variations.

6 PAPERS • NO BENCHMARKS YET

iQIYI-VID

The iQIYI-VID dataset comprises video clips from iQIYI variety shows, films, and television dramas. The whole dataset contains 500,000 video clips of 5,000 celebrities. Each video is 1 to 30 seconds long.

6 PAPERS • NO BENCHMARKS YET

IDiff-Face

The IDiff-Face dataset was proposed in the paper "IDiff-Face: Synthetic-based Face Recognition through Fizzy Identity-Conditioned Diffusion Models". This dataset is synthetically generated using the IDiff-Face model.

5 PAPERS • NO BENCHMARKS YET

MLFW (Masked LFW)

The Masked LFW (MLFW) database, based on the Cross-Age LFW (CALFW) database, is built using a simple but effective tool that automatically generates masked faces from unmasked faces.

5 PAPERS • 1 BENCHMARK

ROF (Real World Occluded Faces)

ROF is a dataset for occluded face recognition that contains faces with both upper face occlusion, due to sunglasses, and lower face occlusion, due to masks.

5 PAPERS • NO BENCHMARKS YET

mEBAL

A multimodal database for eye blink detection and attention level estimation.

5 PAPERS • NO BENCHMARKS YET

CASIA-WebFace+masks

The COVID-19 pandemic raises the problem of adapting face recognition systems to the new reality, where people may wear surgical masks to cover their noses and mouths. Traditional data sets (e.g., CelebA, CASIA-WebFace) used for training these systems were released before the pandemic, so they now seem unsuited due to the lack of examples of people wearing masks. We propose a method for enhancing data sets containing faces without masks by creating synthetic masks and overlaying them on faces in the original images. Our method relies on Spark AR Studio, a developer program made by Facebook that is used to create Instagram face filters. In our approach, we use 9 masks of different colors, shapes and fabrics. We employ our method to generate 445,446 masked samples (90%) for the CASIA-WebFace data set.

4 PAPERS • 1 BENCHMARK

CelebA+masks

4 PAPERS • 1 BENCHMARK

IJB-S (IARPA Janus Benchmark-S)

3 PAPERS • 4 BENCHMARKS

IMDB-Clean

We have cleaned the noisy IMDB-WIKI dataset using a constrained clustering method, resulting in this new benchmark for in-the-wild age estimation. The annotations also allow this dataset to be used for other tasks, such as gender classification and face recognition/verification. For more details, please refer to our FPAge paper.

3 PAPERS • 1 BENCHMARK

MCXFACE (Multi-Channel Heterogeneous Face Recognition dataset)

MCXFace is a heterogeneous face recognition dataset consisting of multi-channel image samples for 51 subjects. For each subject, color (RGB), thermal, near-infrared (850 nm), short-wave infrared (1300 nm), depth, stereo depth, and depth estimated from RGB images are available. Overall, 7406 images together with landmark annotations and standard protocols are available in this dataset.

3 PAPERS • NO BENCHMARKS YET

CASIA-Face-Africa

CASIA-Face-Africa is a face image database which contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. Demographic attributes and facial expressions of the subjects are also carefully recorded. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. A group of evaluation protocols are constructed according to different applications, tasks, partitions and scenarios. The proposed database along with its face landmark annotations, evaluation protocols and preliminary results form a good benchmark to study the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc.

2 PAPERS • NO BENCHMARKS YET

Celeb-HQ Face Gender Recognition Dataset

2 PAPERS • NO BENCHMARKS YET

Celeb-HQ Facial Identity Recognition Dataset

2 PAPERS • NO BENCHMARKS YET

KANFace (KANFace Dataset)

KANFace consists of 40K still images and 44K sequences (14.5M video frames in total) captured in unconstrained, real-world conditions from 1,045 subjects. The dataset is manually annotated in terms of identity, exact age, gender and kinship.

2 PAPERS • 1 BENCHMARK

SWAX (Sense Wax Attack dataset)

SWAX comprises real human and wax figure images and videos that support research on face spoofing detection. The dataset consists of more than 1800 face images and 110 videos of 55 people/waxworks, arranged in training, validation and test sets with a wide range of expression, illumination and pose variations.

2 PAPERS • NO BENCHMARKS YET

Chicago Face Database (CFD)

The Chicago Face Database was developed at the University of Chicago by Debbie S. Ma, Joshua Correll, and Bernd Wittenbrink. The CFD is intended for use in scientific research. It provides high-resolution, standardized photographs of male and female faces of varying ethnicity between the ages of 17-65. Extensive norming data are available for each individual model. These data include both physical attributes (e.g., face size) as well as subjective ratings by independent judges (e.g., attractiveness).

1 PAPER • NO BENCHMARKS YET