MS-Celeb-1M is a large-scale face recognition dataset consisting of 100K identities, each with about 100 facial images. The original identity labels were obtained automatically from webpages.
NOTE: This dataset is currently inactive.
Source: Learning to Cluster Faces on an Affinity Graph