Data Preprocessing to Mitigate Bias with Boosted Fair Mollifiers

1 Dec 2020  ·  Alexander Soen, Hisham Husain, Richard Nock ·

In a recent paper, Celis et al. (2020) introduced a new approach to fairness that corrects the data distribution itself. The approach is computationally appealing, but its approximation guarantees with respect to the target distribution can be quite loose as they need to rely on a (typically limited) number of constraints on data-based aggregated statistics; also resulting in a fairness guarantee which can be data dependent. Our paper makes use of a mathematical object recently introduced in privacy -- mollifiers of distributions -- and a popular approach to machine learning -- boosting -- to get an approach in the same lineage as Celis et al. but without the same impediments, including in particular, better guarantees in terms of accuracy and finer guarantees in terms of fairness. The approach involves learning the sufficient statistics of an exponential family. When the training data is tabular, the sufficient statistics can be defined by decision trees whose interpretability can provide clues on the source of (un)fairness. Experiments display the quality of the results for simulated and real-world data.

PDF Abstract
No code implementations yet. Submit your code now


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.