A large-scale natural dataset in English to measure stereotypical biases in four domains: gender, profession, race, and religion.
46 PAPERS • 1 BENCHMARK
A dataset of 300 news articles annotated with 1,727 bias spans. The authors find evidence that informational bias appears in news articles more frequently than lexical bias.
10 PAPERS • NO BENCHMARKS YET
CI-MNIST (Correlated and Imbalanced MNIST) is a variant of the MNIST dataset that introduces different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image $x$, the label $y \in \{1, 0\}$ indicates eligibility or ineligibility, respectively, according to whether the digit in $x$ is even or odd. The dataset defines the background color as the protected or sensitive attribute $s \in \{0, 1\}$, where blue denotes the unprivileged group and red denotes the privileged group. The dataset was designed to evaluate bias-mitigation approaches in challenging setups and to allow control over different dataset configurations.
4 PAPERS • NO BENCHMARKS YET
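The CI-MNIST labeling rule can be illustrated with a minimal sketch. It is not the dataset authors' generation procedure. The function names, the `p_match` correlation parameter, and the encoding $s = 1$ for red and $s = 0$ for blue are assumptions made for illustration only.

```python
import random


def eligibility_label(digit: int) -> int:
    """Return y = 1 (eligible) for an even digit, y = 0 (ineligible) for an odd one."""
    return 1 if digit % 2 == 0 else 0


def sample_background(label: int, p_match: float, rng: random.Random) -> int:
    """Sample the sensitive attribute s (assumed: 1 = red/privileged, 0 = blue/unprivileged).

    With probability p_match the background agrees with the label, which
    correlates the sensitive attribute with eligibility. p_match is a
    hypothetical knob, not a documented dataset parameter.
    """
    return label if rng.random() < p_match else 1 - label


def annotate(digits, p_match=0.9, seed=0):
    """Return (digit, y, s) triples for a sequence of digit classes."""
    rng = random.Random(seed)
    rows = []
    for d in digits:
        y = eligibility_label(d)
        s = sample_background(y, p_match, rng)
        rows.append((d, y, s))
    return rows
```

Setting `p_match` to 0.5 would make the background color independent of eligibility in this sketch, while values near 1.0 would make it a strong spurious cue.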
Grep-BiasIR is a novel, thoroughly audited dataset that aims to facilitate the study of gender bias in the retrieved results of IR systems.
2 PAPERS • NO BENCHMARKS YET
A text corpus of more than 200,000 sentences from eleven news sources regarding Donald Trump.