One of the first datasets (if not the first) to highlight the importance of bias and diversity in the community, which started a revolution afterwards. Introduced in 2014 as integral part of a thesis of Master of Science [1,2] at Carnegie Mellon and City University of Hong Kong. It was later expanded by adding synthetic images generated by a GAN architecture at ETH Zürich (in HDCGAN by Curtó et al. 2017). Being then not only the pioneer of talking about the importance of balanced datasets for learning and vision but also for being the first GAN augmented dataset of faces.

The original description goes as follows:

A bias-free dataset, containing human faces from different ethnical groups in a wide variety of illumination conditions and image resolutions. C&Z is enhanced with HDCGAN synthetic images, thus being the first GAN augmented dataset of faces.


Supplement (with scripts to handle the labels):




Paper Code Results Date Stars

Dataset Loaders



  • Unknown