BiasBios (Bias in Bios)

Introduced by De-Arteaga et al. in Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

The purpose of this dataset was to study gender bias in occupations. Online biographies, written in English, were collected to find the names, pronouns, and occupations. Twenty-eight most frequent occupations were identified based on their appearances. The resulting dataset consists of 397,340 biographies spanning twenty-eight different occupations. Of these occupations, the professor is the most frequent, with 118,400 biographies, while the rapper is the least frequent, with 1,406 biographies. Important information about the biographies: 1. The longest biography is 194 tokens, while the shortest is eighteen; the median biography length is seventy-two tokens. 2. It should be noted that the demographics of online biographies’ subjects differ from those of the overall workforce and that this dataset does not contain all biographies on the Internet.

Papers


Paper Code Results Date Stars

Dataset Loaders


Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages