1 code implementation • 2 Jul 2024 • Fan Wu, Emily Black, Varun Chandrasekaran
We introduce generative monoculture, a behavior observed in large language models (LLMs) characterized by a significant narrowing of model output diversity relative to available training data for a given task: for example, generating only positive book reviews for books with a mixed reception.
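Not from the paper itself, but as a rough illustration of what "narrowing of output diversity" could look like as a measurement: the minimal sketch below compares the entropy of hypothetical sentiment labels over human-written reviews with the same statistic over model generations for the same books. All labels and data here are invented for the example.

```python
from collections import Counter
from math import log2

def label_entropy(labels):
    """Shannon entropy (bits) of a list of categorical labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical sentiment labels for reviews of the same set of books.
human_review_sentiments = ["positive", "negative", "positive", "mixed",
                           "negative", "positive", "mixed", "negative"]
llm_review_sentiments   = ["positive", "positive", "positive", "positive",
                           "positive", "positive", "mixed", "positive"]

# A large drop in entropy from source data to generations is one simple
# signal of the diversity narrowing described in the abstract above.
print(f"human reviews: {label_entropy(human_review_sentiments):.2f} bits")
print(f"model outputs: {label_entropy(llm_review_sentiments):.2f} bits")
```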
no code implementations • 2 Oct 2023 • Hadi Elzayn, Emily Black, Patrick Vossler, Nathanael Jo, Jacob Goldin, Daniel E. Ho
Unlike similar existing approaches, our methods take advantage of contextual information -- specifically, the relationships between a model's predictions and the probabilistic predictions of the protected attribute, given the true protected attribute, and vice versa -- to provide tighter bounds on the true disparity.
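The paper's bounds are not reproduced here; as background, the sketch below shows the naive probability-weighted point estimate of a demographic-parity gap when only estimated probabilities of group membership are available, which is the kind of estimate such bounds are meant to tighten. The probabilities and predictions are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: binary model decisions and, for each person, an
# estimated probability of protected-group membership (true labels unseen).
y_hat = rng.integers(0, 2, size=1000)      # model's binary decisions
p_group = rng.beta(2, 5, size=1000)        # P(protected group | proxies)

# Naive probability-weighted positive rates for each group.
rate_protected = np.sum(y_hat * p_group) / np.sum(p_group)
rate_other = np.sum(y_hat * (1 - p_group)) / np.sum(1 - p_group)

# This point estimate can be badly biased when proxy errors correlate with
# the model's predictions -- the situation tighter bounds are meant to handle.
print(f"estimated demographic-parity gap: {rate_protected - rate_other:+.3f}")
```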
no code implementations • 29 Sep 2023 • Emily Black, Rakshit Naidu, Rayid Ghani, Kit T. Rodolfa, Daniel E. Ho, Hoda Heidari
While algorithmic fairness is a thriving area of research, in practice, mitigating bias often gets reduced to enforcing an arbitrarily chosen fairness metric, whether by imposing fairness constraints during optimization, by post-processing model outputs, or by manipulating the training data.
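As a concrete instance of the "post-processing model outputs" option named above (an illustration of the practice being critiqued, not a recommendation from the paper), the sketch below picks group-specific decision thresholds so that positive rates match across two groups; the scores and target rate are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic scores and group labels; group B systematically scores lower.
scores = np.concatenate([rng.normal(0.6, 0.15, 500), rng.normal(0.45, 0.15, 500)])
groups = np.array(["A"] * 500 + ["B"] * 500)

target_positive_rate = 0.30  # arbitrarily chosen, as the critique suggests

# Per-group threshold: the (1 - target) quantile of that group's scores.
thresholds = {g: np.quantile(scores[groups == g], 1 - target_positive_rate)
              for g in ("A", "B")}

decisions = scores >= np.array([thresholds[g] for g in groups])
for g in ("A", "B"):
    print(g, f"positive rate = {decisions[groups == g].mean():.2f}")
```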
no code implementations • 20 Jun 2022 • Emily Black, Hadi Elzayn, Alexandra Chouldechova, Jacob Goldin, Daniel E. Ho
First, we show how the use of more flexible machine learning (classification) methods -- as opposed to simpler models -- shifts audit burdens from high-income to middle-income taxpayers.
no code implementations • ICLR 2022 • Emily Black, Klas Leino, Matt Fredrikson
Recent work has shown that models trained to the same objective, and which achieve similar accuracy on the same test data, may nonetheless behave very differently on individual predictions.
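A minimal sketch of the phenomenon (not the paper's method): train several models that differ only in random seed, confirm their test accuracies are close, and count how often they disagree on individual test points. This assumes scikit-learn and synthetic data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Train several models that differ only in their random seed.
preds, accs = [], []
for seed in range(5):
    clf = RandomForestClassifier(n_estimators=50, random_state=seed).fit(X_tr, y_tr)
    p = clf.predict(X_te)
    preds.append(p)
    accs.append((p == y_te).mean())

preds = np.array(preds)
# Fraction of test points on which the models do not all agree.
disagreement = (preds != preds[0]).any(axis=0).mean()
print("accuracies:", [f"{a:.3f}" for a in accs])
print(f"points with at least one disagreeing model: {disagreement:.1%}")
```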
no code implementations • ICLR 2022 • Emily Black, Zifan Wang, Matt Fredrikson, Anupam Datta
Counterfactual examples are one of the most commonly cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis.
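To ground the term, the sketch below generates a Wachter-style counterfactual for a fixed logistic model by gradient descent on the input: it searches for a nearby point that the model classifies differently. This is a generic illustration under assumed weights, not the consistency analysis developed in the paper.

```python
import numpy as np

# A fixed logistic model: p(y=1|x) = sigmoid(w.x + b), weights assumed.
w = np.array([1.5, -2.0, 0.5])
b = -0.2
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def counterfactual(x, target=1, lam=0.1, lr=0.2, steps=1000):
    """Gradient search for a nearby input the model labels as `target`."""
    x_cf = x.copy()
    for _ in range(steps):
        p = sigmoid(w @ x_cf + b)
        # Loss: squared gap to the target probability + distance to original x.
        grad = 2 * (p - target) * p * (1 - p) * w + 2 * lam * (x_cf - x)
        x_cf -= lr * grad
    return x_cf

x = np.array([-1.0, 1.0, 0.0])            # originally classified as 0
x_cf = counterfactual(x)
print("original prediction:      ", sigmoid(w @ x + b) > 0.5)
print("counterfactual prediction:", sigmoid(w @ x_cf + b) > 0.5)
print("counterfactual input:     ", np.round(x_cf, 3))
```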
no code implementations • 21 Jul 2021 • Emily Black, Matt Fredrikson
We introduce leave-one-out unfairness, which characterizes how likely a model's prediction for an individual is to change due to the inclusion or removal of a single other person in the model's training data.
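A brute-force sketch of the quantity being described, under assumed details (scikit-learn logistic regression, synthetic data): retrain the model with each training point left out in turn and count how often the prediction for one fixed individual flips.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
x_query = X[-1:]                      # the individual whose prediction we track
X_train, y_train = X[:-1], y[:-1]

base_pred = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict(x_query)[0]

# Retrain once per left-out training point and record prediction flips.
flips = 0
for i in range(len(X_train)):
    X_loo = np.delete(X_train, i, axis=0)
    y_loo = np.delete(y_train, i)
    pred = LogisticRegression(max_iter=1000).fit(X_loo, y_loo).predict(x_query)[0]
    flips += int(pred != base_pred)

print(f"prediction flips in {flips} of {len(X_train)} leave-one-out retrainings")
```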
1 code implementation • 21 Jun 2019 • Emily Black, Samuel Yeom, Matt Fredrikson
We present FlipTest, a black-box technique for uncovering discrimination in classifiers.
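A crude stand-in for the idea, not the paper's construction: pair each member of one group with a similar member of the other group -- here via nearest-neighbor matching on non-protected features, an assumption made for simplicity -- and inspect the matched pairs on which the classifier's decisions differ. Data and classifier are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Synthetic data; a binary group attribute is appended as the last column.
X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
group = (np.random.default_rng(0).random(1000) < 0.5).astype(int)

clf = LogisticRegression(max_iter=1000).fit(np.column_stack([X, group]), y)

# Match each group-1 member to its nearest group-0 neighbor on the
# non-protected features (a crude stand-in for a cross-group mapping).
X0, X1 = X[group == 0], X[group == 1]
nn = NearestNeighbors(n_neighbors=1).fit(X0)
_, idx = nn.kneighbors(X1)

pred1 = clf.predict(np.column_stack([X1, np.ones(len(X1))]))
pred0 = clf.predict(np.column_stack([X0[idx[:, 0]], np.zeros(len(X1))]))

# The "flipset": matched pairs that receive different decisions.
flipset = np.flatnonzero(pred1 != pred0)
print(f"{len(flipset)} of {len(X1)} matched pairs receive different decisions")
```

On this synthetic data the flipset mostly reflects matching error rather than discrimination; the point of the sketch is only the mechanics of comparing decisions across matched pairs.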
no code implementations • ICLR 2019 • Klas Leino, Emily Black, Matt Fredrikson, Shayak Sen, Anupam Datta
This overestimation of weakly predictive features gives rise to feature-wise bias amplification -- a previously unreported form of bias that can be traced back to the features of a trained model.
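One simple way to quantify bias amplification in general (a hedged illustration; the paper's feature-wise analysis is not reproduced here): compare how often the label co-occurs with a feature in the data against how often the model's predictions co-occur with it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=15, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)

# For one (binarized) feature, compare P(y=1 | feature active) in the data
# with P(y_hat=1 | feature active) in the predictions; a positive gap means
# the model amplifies the feature-label association.
feature_active = X_te[:, 0] > 0
data_rate = y_te[feature_active].mean()
pred_rate = y_hat[feature_active].mean()
print(f"data co-occurrence: {data_rate:.3f}   "
      f"predicted co-occurrence: {pred_rate:.3f}   "
      f"amplification: {pred_rate - data_rate:+.3f}")
```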