no code implementations • NeurIPS 2023 • Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju
Large Language Models (LLMs) have demonstrated remarkable capabilities in performing complex tasks.
1 code implementation • 25 Apr 2023 • Dylan Slack, Sameer Singh
Acquiring high-quality data is often a significant challenge in training machine learning (ML) models for tabular prediction, particularly in privacy-sensitive and costly domains like medicine and finance.
1 code implementation • 8 Jul 2022 • Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh
In real-world evaluations with humans, 73% of healthcare workers (e.g., doctors and nurses) agreed they would use TalkToModel over baseline point-and-click systems for explainability in a disease prediction task, and 85% of ML professionals agreed TalkToModel was easier to use for computing explanations.
no code implementations • 10 Feb 2022 • Dylan Slack, Yinlam Chow, Bo Dai, Nevan Wichers
However, we find that these techniques are not well suited to safe policy learning because they ignore negative experiences (e.g., unsafe or unsuccessful ones) and focus only on positive experiences, which harms their ability to generalize safely to new tasks.
1 code implementation • 3 Feb 2022 • Himabindu Lakkaraju, Dylan Slack, Yuxin Chen, Chenhao Tan, Sameer Singh
Overall, we hope our work serves as a starting place for researchers and engineers to design interactive explainability systems.
no code implementations • 23 Jun 2021 • Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju
As machine learning models are increasingly used in critical decision-making settings (e.g., healthcare, finance), there has been a growing emphasis on developing methods to explain model predictions.
no code implementations • Findings (ACL) 2021 • Muhammad Bilal Zafar, Michele Donini, Dylan Slack, Cédric Archambeau, Sanjiv Das, Krishnaram Kenthapadi
With the ever-increasing complexity of neural language models, practitioners have turned to methods for understanding the predictions of these models.
no code implementations • NeurIPS 2021 • Dylan Slack, Sophie Hilgard, Himabindu Lakkaraju, Sameer Singh
In this work, we introduce the first framework that describes the vulnerabilities of counterfactual explanations and shows how they can be manipulated.
no code implementations • 11 Feb 2021 • Dylan Slack, Nathalie Rauschmayr, Krishnaram Kenthapadi
Each region contains a specific type of model bug; for instance, a misclassification region for an MNIST classifier contains a style of skinny 6 that the model mistakes for a 1.
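As a rough illustration of how such regions might be surfaced (a simplified sketch, not the method proposed in this work), one could cluster a model's misclassified validation examples and inspect each cluster for a shared failure mode; the helper below, including the name find_bug_regions and its parameters, is hypothetical:

```python
# Simplified sketch (not this paper's method): group a model's misclassified
# validation examples into candidate "bug regions" by clustering them, so each
# cluster can be inspected for a shared failure mode (e.g., skinny 6s read as 1s).
import numpy as np
from sklearn.cluster import KMeans


def find_bug_regions(model, X_val, y_val, n_regions=10, seed=0):
    # X_val is assumed to be a 2-D array (e.g., flattened MNIST images).
    preds = model.predict(X_val)
    mistakes = X_val[preds != y_val]
    if len(mistakes) < n_regions:
        return [mistakes]  # too few errors to split into separate regions
    labels = KMeans(n_clusters=n_regions, random_state=seed, n_init=10).fit_predict(mistakes)
    return [mistakes[labels == k] for k in range(n_regions)]
```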
no code implementations • EMNLP (PrivateNLP) 2020 • Gavin Kerrigan, Dylan Slack, Jens Tuyls
Language modeling is a keystone task in natural language processing.
1 code implementation • NeurIPS 2021 • Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju
In this paper, we address the aforementioned challenges by developing a novel Bayesian framework for generating local explanations along with their associated uncertainty.
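The snippet below is only a rough illustration of the general idea of attaching uncertainty to a local explanation: it bootstraps a LIME-style linear surrogate rather than using the Bayesian framework developed in the paper, and the function name and parameters are assumptions made for illustration:

```python
# Illustrative sketch only (not the paper's Bayesian framework): estimate
# uncertainty in a local linear explanation by refitting it on resampled
# perturbation data and reporting the spread of the coefficients.
import numpy as np
from sklearn.linear_model import Ridge


def local_explanation_with_uncertainty(model, x, n_perturbations=1000,
                                       n_resamples=50, noise_scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance x (a 1-D feature vector) and query the black-box model.
    Z = x + rng.normal(0, noise_scale, (n_perturbations, x.shape[0]))
    y = model.predict_proba(Z)[:, 1]
    # Refit the local surrogate on bootstrap resamples of the perturbations.
    coefs = []
    for _ in range(n_resamples):
        idx = rng.integers(0, n_perturbations, n_perturbations)
        coefs.append(Ridge(alpha=1.0).fit(Z[idx], y[idx]).coef_)
    coefs = np.array(coefs)
    # Mean feature importance plus a simple 95% interval per feature.
    return coefs.mean(axis=0), np.percentile(coefs, [2.5, 97.5], axis=0)
```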
2 code implementations • 6 Nov 2019 • Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju
Our approach can be used to scaffold any biased classifier in such a way that its predictions on the input data distribution still remain biased, but the post hoc explanations of the scaffolded classifier look innocuous.
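A minimal sketch of the general scaffolding idea (not the paper's implementation): an out-of-distribution detector separates real inputs from the synthetic perturbations that explainers such as LIME and SHAP generate, and routes those perturbations to an innocuous model so the resulting explanations look benign while predictions on real data stay biased. All class and function names here are hypothetical:

```python
# Hedged sketch of a scaffolded classifier: biased on in-distribution inputs,
# innocuous on the off-manifold points produced by perturbation-based explainers.
import numpy as np
from sklearn.ensemble import RandomForestClassifier


class ScaffoldedClassifier:
    def __init__(self, biased_model, innocuous_model, ood_detector):
        self.biased_model = biased_model        # e.g., depends on a sensitive feature
        self.innocuous_model = innocuous_model  # e.g., depends only on benign features
        self.ood_detector = ood_detector        # flags explainer-generated perturbations

    def predict(self, X):
        is_real = self.ood_detector.predict(X)  # 1 = looks like real data, 0 = perturbation
        biased = self.biased_model.predict(X)
        innocuous = self.innocuous_model.predict(X)
        return np.where(is_real == 1, biased, innocuous)


def fit_ood_detector(X_real, n_synthetic=None, noise_scale=0.3, seed=0):
    """Train a classifier to separate real rows from LIME-style perturbations."""
    rng = np.random.default_rng(seed)
    n_synthetic = n_synthetic or len(X_real)
    X_synth = X_real[rng.integers(0, len(X_real), n_synthetic)] + \
        rng.normal(0, noise_scale, (n_synthetic, X_real.shape[1]))
    X = np.vstack([X_real, X_synth])
    y = np.concatenate([np.ones(len(X_real)), np.zeros(n_synthetic)])
    return RandomForestClassifier(n_estimators=100).fit(X, y)
```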
no code implementations • 6 Nov 2019 • Dylan Slack, Sorelle Friedler, Emile Givental
Data sets for fairness-relevant tasks can lack examples or be biased according to a specific label in a sensitive attribute.
1 code implementation • 24 Aug 2019 • Dylan Slack, Sorelle Friedler, Emile Givental
Then, we illustrate the usefulness of both algorithms as a combined method for training models from a few data points on new tasks while using Fairness Warnings as interpretable boundary conditions under which the newly trained model may not be fair.
no code implementations • 9 Feb 2019 • Dylan Slack, Sorelle A. Friedler, Carlos Scheidegger, Chitradeep Dutta Roy
Through a user study with 1,000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable.