4 dataset results for Fairness AND Tabular

The Netflix Prize dataset consists of about 100,000,000 ratings of 17,770 movies given by 480,189 users. Each rating in the training dataset consists of four entries: user, movie, date of rating, and rating. Users and movies are represented by integer IDs, while ratings range from 1 to 5.
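As a rough illustration of the record layout described above, the following sketch models one rating and computes how sparsely the user-by-movie matrix is filled. The class name `Rating`, its field names, and the helper `matrix_density` are hypothetical and are not part of the dataset's official tooling. The counts are the approximate figures quoted in the description.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Rating:
    """One training record: user, movie, date of rating, rating (hypothetical field names)."""
    user_id: int
    movie_id: int
    rated_on: date
    rating: int

    def __post_init__(self) -> None:
        # Ratings are integers in the range 1 to 5, per the description.
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be in 1..5, got {self.rating}")


# Approximate figures taken from the dataset description.
N_RATINGS = 100_000_000
N_MOVIES = 17_770
N_USERS = 480_189


def matrix_density(n_ratings: int, n_users: int, n_movies: int) -> float:
    """Fraction of the full user-by-movie matrix that has an observed rating."""
    return n_ratings / (n_users * n_movies)


if __name__ == "__main__":
    example = Rating(user_id=1, movie_id=1, rated_on=date(2005, 1, 1), rating=4)
    print(example)
    print(f"density: {matrix_density(N_RATINGS, N_USERS, N_MOVIES):.4f}")
```

With these figures, roughly 1.2% of all possible user-movie pairs carry a rating, which is why the data is usually treated as a sparse matrix.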

343 PAPERS • 1 BENCHMARK

ACS PUMS

ACS PUMS stands for American Community Survey (ACS) Public Use Microdata Sample (PUMS) and has been used to construct several tabular datasets for studying fairness in machine learning.

9 PAPERS • NO BENCHMARKS YET

CI-MNIST (Correlated and Imbalanced MNIST) is a variant of the MNIST dataset that introduces different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image $x$, the label $y \in \{1, 0\}$ indicates eligibility or ineligibility, respectively, according to whether the digit shown in $x$ is even or odd. The dataset defines the background color as the protected or sensitive attribute $s \in \{0, 1\}$, where blue denotes the unprivileged group and red denotes the privileged group. It was designed to evaluate bias-mitigation approaches in challenging setups and to allow control over different dataset configurations.
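The labeling rule above can be sketched in a few lines. This is a minimal illustration, not the dataset authors' generation code. The function names are hypothetical, and the numeric encoding of the sensitive attribute (blue as 0, red as 1) is an assumption, since the description only states which color corresponds to which group.

```python
def eligibility(digit: int) -> int:
    """Return y = 1 (eligible) for even digits and y = 0 (ineligible) for odd digits."""
    if not 0 <= digit <= 9:
        raise ValueError(f"MNIST digits are in 0..9, got {digit}")
    return 1 if digit % 2 == 0 else 0


# ASSUMPTION: blue (unprivileged) is encoded as 0 and red (privileged) as 1.
BACKGROUND_TO_S = {"blue": 0, "red": 1}


def sensitive_attribute(background: str) -> int:
    """Map a background color to the sensitive attribute s."""
    try:
        return BACKGROUND_TO_S[background]
    except KeyError:
        raise ValueError(f"unknown background color: {background!r}") from None


if __name__ == "__main__":
    for digit, background in [(4, "blue"), (7, "red")]:
        y = eligibility(digit)
        s = sensitive_attribute(background)
        print(f"digit={digit} background={background} -> y={y}, s={s}")
```

Separating the two functions mirrors the description: eligibility depends only on the digit, while the sensitive attribute depends only on the background, so any correlation between them comes from how images are sampled rather than from the labeling rule itself.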

4 PAPERS • NO BENCHMARKS YET

ICLR Database (ICLR Database (with Textual Covariates))

A maintained database tracks ICLR submissions and reviews, augmented with author profiles and higher-level textual features.

1 PAPER • NO BENCHMARKS YET
