A large-scale natural dataset in English to measure stereotypical biases in four domains: gender, profession, race, and religion.
133 PAPERS • 1 BENCHMARK
A dataset of 300 news articles annotated with 1,727 bias spans; the annotations provide evidence that informational bias appears in news articles more frequently than lexical bias.
24 PAPERS • NO BENCHMARKS YET
CI-MNIST (Correlated and Imbalanced MNIST) is a variant of the MNIST dataset that introduces different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image $x$, the label $y \in \{1, 0\}$ indicates eligibility or ineligibility, respectively, depending on whether $x$ depicts an even or odd digit. The background color serves as the protected (sensitive) attribute $s \in \{0, 1\}$, where blue denotes the unprivileged group and red denotes the privileged group. The dataset was designed to evaluate bias-mitigation approaches in challenging setups and to allow control over different dataset configurations.
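As a minimal sketch of the labeling scheme described above: the function below assigns the parity-based eligibility label and a label-correlated background color to a single MNIST digit. The helper name and the 0.9 correlation ratio are illustrative assumptions, not the official generation code, which exposes these settings as configurable parameters.

```python
import numpy as np

def make_ci_mnist_example(image, digit, rng, p_privileged_given_eligible=0.9):
    """Illustrative CI-MNIST example: label + correlated background color.

    `image` is a (28, 28) uint8 grayscale MNIST digit; `rng` is a
    numpy Generator. The 0.9 ratio is a hypothetical configuration.
    """
    y = 1 if digit % 2 == 0 else 0  # eligible iff the digit is even
    # Correlate the sensitive attribute with the label: eligible images
    # are more often assigned the privileged (red) background.
    p_red = p_privileged_given_eligible if y == 1 else 1 - p_privileged_given_eligible
    if rng.random() < p_red:
        s, background = 1, np.array([255, 0, 0], dtype=np.uint8)  # privileged: red
    else:
        s, background = 0, np.array([0, 0, 255], dtype=np.uint8)  # unprivileged: blue
    # Paint the (near-black) MNIST background with the chosen color.
    rgb = np.stack([image] * 3, axis=-1).astype(np.uint8)
    rgb[image < 32] = background
    return rgb, y, s
```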
4 PAPERS • NO BENCHMARKS YET
Grep-BiasIR is a novel, thoroughly audited dataset that aims to facilitate the study of gender bias in the retrieved results of information retrieval (IR) systems.
A text corpus of more than 200,000 sentences from eleven news sources regarding Donald Trump.
2 PAPERS • NO BENCHMARKS YET
"The Chicago Face Database was developed at the University of Chicago by Debbie S. Ma, Joshua Correll, and Bernd Wittenbrink. The CFD is intended for use in scientific research. It provides high-resolution, standardized photographs of male and female faces of varying ethnicity between the ages of 17-65. Extensive norming data are available for each individual model. These data include both physical attributes (e.g., face size) as well as subjective ratings by independent judges (e.g., attractiveness).
1 PAPER • NO BENCHMARKS YET
TwinViews-13k is a dataset of 13,855 pairs of left-leaning and right-leaning political statements, each pair matched by topic. It was created to study political bias in reward and language models, with a focus on understanding how alignment to truthfulness interacts with the emergence of political bias. The dataset was generated using GPT-3.5 Turbo, with extensive auditing to ensure ideological balance and topical relevance. It can be used for tasks related to political bias, natural language processing, and model alignment, particularly in studies examining how political orientation affects model outputs.
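One way to use the matched pairs is to compare a reward model's scores on the left- and right-leaning member of each pair; since the pairs are topic-matched, a systematic score gap points to ideological preference rather than topical confounds. The sketch below is hypothetical: the `reward_model.score` interface and the `"left"`/`"right"` field names are assumptions, not the released API.

```python
def political_lean_gap(pairs, reward_model):
    """Mean reward difference between matched left- and right-leaning statements.

    `pairs` is an iterable of dicts with hypothetical "left"/"right" text fields;
    `reward_model.score` is an assumed scalar-scoring interface. A mean gap
    far from zero suggests the model systematically prefers one leaning.
    """
    gaps = [reward_model.score(p["left"]) - reward_model.score(p["right"])
            for p in pairs]
    return sum(gaps) / len(gaps)
```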
WEATHub is a dataset covering 24 languages. It organizes words into groups of (target1, target2, attribute1, attribute2) to measure the association target1:target2 :: attribute1:attribute2. For example, target1 might be insects and target2 flowers, with the attributes probing whether insects or flowers are perceived as pleasant or unpleasant. Word associations are quantified using the WEAT metric, which computes an effect size (Cohen's d) along with a p-value to measure the statistical significance of the result. In the accompanying paper, these tests are applied to word embeddings from language models to study biased associations across languages.
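For concreteness, the following sketch computes the standard WEAT statistics for target word sets X, Y and attribute sets A, B, each given as embedding vectors: the Cohen's-d-style effect size, plus a one-sided permutation p-value. Function names are illustrative; this is the published WEAT formulation (Caliskan et al.), not code from the WEATHub paper itself.

```python
import numpy as np

def _assoc(w, A, B):
    """s(w, A, B): mean cosine similarity of w to A minus its mean similarity to B."""
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size (Cohen's d) over target sets X, Y and attribute sets A, B."""
    sx = np.array([_assoc(x, A, B) for x in X])
    sy = np.array([_assoc(y, A, B) for y in Y])
    return (sx.mean() - sy.mean()) / np.concatenate([sx, sy]).std(ddof=1)

def weat_p_value(X, Y, A, B, n_perm=10_000, seed=0):
    """One-sided p-value: fraction of random re-partitions of X ∪ Y whose
    mean-association difference exceeds the observed one."""
    rng = np.random.default_rng(seed)
    s = np.array([_assoc(w, A, B) for w in np.concatenate([X, Y])])
    n = len(X)
    observed = s[:n].mean() - s[n:].mean()
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(s))
        hits += (s[perm[:n]].mean() - s[perm[n:]].mean()) > observed
    return hits / n_perm
```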
The Innodata Red Teaming Prompts dataset aims to rigorously assess models' factuality and safety. Because it was manually created and covers a broad range of scenarios, it facilitates a comprehensive examination of LLM performance across diverse settings.
1 PAPER • 1 BENCHMARK