FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age

14 Aug 2019 · Kimmo Kärkkäinen, Jungseock Joo

Existing public face datasets are strongly biased toward Caucasian faces, and other races (e.g., Latino) are significantly underrepresented. This can lead to inconsistent model accuracy, limit the applicability of face analytic systems to non-White race groups, and adversely affect research findings based on such skewed data. To mitigate the race bias in these datasets, we construct a novel face image dataset containing 108,501 images, with an emphasis on balanced race composition. We define 7 race groups: White, Black, Indian, East Asian, Southeast Asian, Middle East, and Latino. Images were collected from the YFCC-100M Flickr dataset and labeled with race, gender, and age groups. Evaluations were performed on existing face attribute datasets as well as novel image datasets to measure generalization performance. We find that the model trained on our dataset is substantially more accurate on novel datasets and that its accuracy is consistent across race and gender groups.
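
The abstract's claim that accuracy is consistent across groups can be checked by computing top-1 accuracy separately for each group and comparing the results. The following is a minimal sketch of that idea, not the authors' evaluation code: the function names `per_group_top1` and `accuracy_gap` are hypothetical, and the labels in the usage lines are made-up toy data.

```python
from collections import defaultdict


def per_group_top1(y_true, y_pred, groups):
    """Return {group: top-1 accuracy} for predictions split by group label.

    y_true, y_pred and groups are equal-length sequences; groups[i] is the
    demographic group (e.g. a race label) of example i.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy usage with made-up gender predictions for two of the race groups.
y_true = ["Male", "Female", "Male", "Female"]
y_pred = ["Male", "Female", "Female", "Female"]
groups = ["White", "White", "Latino", "Latino"]

scores = per_group_top1(y_true, y_pred, groups)
print(scores)
print(accuracy_gap(scores))
```

A small gap between the best and worst group is one simple way to read "consistent"; other summaries, such as the standard deviation of per-group accuracy, would serve equally well.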


Datasets


Introduced in the Paper:

FairFace

Used in the Paper:

CelebA, LFW, UTKFace, YFCC100M, MORPH, LFWA

Task                             Dataset   Model     Metric Name  Metric Value  Global Rank
Facial Attribute Classification  FairFace  FairFace  race-top1    93.7          # 1
Facial Attribute Classification  FairFace  FairFace  gender-top1  94.2          # 3
Facial Attribute Classification  FairFace  FairFace  age-top1     59.7          # 3
