no code implementations • NeurIPS 2023 • Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju
Large Language Models (LLMs) have demonstrated remarkable capabilities in performing complex tasks.
1 code implementation • 25 Apr 2023 • Dylan Slack, Sameer Singh
Acquiring high-quality data is often a significant challenge in training machine learning (ML) models for tabular prediction, particularly in privacy-sensitive and costly domains like medicine and finance.
1 code implementation • 8 Jul 2022 • Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh
In real-world evaluations with humans, 73% of healthcare workers (e.g., doctors and nurses) agreed they would use TalkToModel over baseline point-and-click systems for explainability in a disease prediction task, and 85% of ML professionals agreed TalkToModel was easier to use for computing explanations.
no code implementations • 10 Feb 2022 • Dylan Slack, Yinlam Chow, Bo Dai, Nevan Wichers
However, we find that these techniques are not well suited to safe policy learning because they ignore negative experiences (e.g., unsafe or unsuccessful ones) and focus only on positive experiences, which harms their ability to generalize safely to new tasks.
1 code implementation • 3 Feb 2022 • Himabindu Lakkaraju, Dylan Slack, Yuxin Chen, Chenhao Tan, Sameer Singh
Overall, we hope our work serves as a starting place for researchers and engineers to design interactive explainability systems.
no code implementations • 23 Jun 2021 • Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju
As machine learning models are increasingly used in critical decision-making settings (e.g., healthcare, finance), there has been a growing emphasis on developing methods to explain model predictions.
no code implementations • Findings (ACL) 2021 • Muhammad Bilal Zafar, Michele Donini, Dylan Slack, Cédric Archambeau, Sanjiv Das, Krishnaram Kenthapadi
With the ever-increasing complexity of neural language models, practitioners have turned to methods for understanding the predictions of these models.
no code implementations • NeurIPS 2021 • Dylan Slack, Sophie Hilgard, Himabindu Lakkaraju, Sameer Singh
In this work, we introduce the first framework that describes the vulnerabilities of counterfactual explanations and shows how they can be manipulated.
no code implementations • 11 Feb 2021 • Dylan Slack, Nathalie Rauschmayr, Krishnaram Kenthapadi
Each region contains a specific type of model bug; for instance, a misclassification region for an MNIST classifier contains a style of skinny 6 that the model mistakes for a 1.
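As a rough illustration of how such regions might be surfaced (a simplified sketch, not the method proposed in this work), one could cluster a model's misclassified validation examples and inspect each cluster for a shared failure mode; the helper below, including the name find_bug_regions and its parameters, is hypothetical:

```python
# Simplified sketch (not this paper's method): group a model's misclassified
# validation examples into candidate "bug regions" by clustering them, so each
# cluster can be inspected for a shared failure mode (e.g., skinny 6s read as 1s).
import numpy as np
from sklearn.cluster import KMeans


def find_bug_regions(model, X_val, y_val, n_regions=10, seed=0):
    # X_val is assumed to be a 2-D array (e.g., flattened MNIST images).
    preds = model.predict(X_val)
    mistakes = X_val[preds != y_val]
    if len(mistakes) < n_regions:
        return [mistakes]  # too few errors to split into separate regions
    labels = KMeans(n_clusters=n_regions, random_state=seed, n_init=10).fit_predict(mistakes)
    return [mistakes[labels == k] for k in range(n_regions)]
```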
no code implementations • EMNLP (PrivateNLP) 2020 • Gavin Kerrigan, Dylan Slack, Jens Tuyls
Language modeling is a keystone task in natural language processing.
1 code implementation • NeurIPS 2021 • Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju
In this paper, we address the aforementioned challenges by developing a novel Bayesian framework for generating local explanations along with their associated uncertainty.
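The snippet below is only a rough illustration of the general idea of attaching uncertainty to a local explanation: it bootstraps a LIME-style linear surrogate rather than using the Bayesian framework developed in the paper, and the function name and parameters are assumptions made for illustration:

```python
# Illustrative sketch only (not the paper's Bayesian framework): estimate
# uncertainty in a local linear explanation by refitting it on resampled
# perturbation data and reporting the spread of the coefficients.
import numpy as np
from sklearn.linear_model import Ridge


def local_explanation_with_uncertainty(model, x, n_perturbations=1000,
                                       n_resamples=50, noise_scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance x (a 1-D feature vector) and query the black-box model.
    Z = x + rng.normal(0, noise_scale, (n_perturbations, x.shape[0]))
    y = model.predict_proba(Z)[:, 1]
    # Refit the local surrogate on bootstrap resamples of the perturbations.
    coefs = []
    for _ in range(n_resamples):
        idx = rng.integers(0, n_perturbations, n_perturbations)
        coefs.append(Ridge(alpha=1.0).fit(Z[idx], y[idx]).coef_)
    coefs = np.array(coefs)
    # Mean feature importance plus a simple 95% interval per feature.
    return coefs.mean(axis=0), np.percentile(coefs, [2.5, 97.5], axis=0)
```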
2 code implementations • 6 Nov 2019 • Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju
Our approach can be used to scaffold any biased classifier in such a way that its predictions on the input data distribution still remain biased, but the post hoc explanations of the scaffolded classifier look innocuous.
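A minimal sketch of the general scaffolding idea (not the paper's implementation): an out-of-distribution detector separates real inputs from the synthetic perturbations that explainers such as LIME and SHAP generate, and routes those perturbations to an innocuous model so the resulting explanations look benign while predictions on real data stay biased. All class and function names here are hypothetical:

```python
# Hedged sketch of a scaffolded classifier: biased on in-distribution inputs,
# innocuous on the off-manifold points produced by perturbation-based explainers.
import numpy as np
from sklearn.ensemble import RandomForestClassifier


class ScaffoldedClassifier:
    def __init__(self, biased_model, innocuous_model, ood_detector):
        self.biased_model = biased_model        # e.g., depends on a sensitive feature
        self.innocuous_model = innocuous_model  # e.g., depends only on benign features
        self.ood_detector = ood_detector        # flags explainer-generated perturbations

    def predict(self, X):
        is_real = self.ood_detector.predict(X)  # 1 = looks like real data, 0 = perturbation
        biased = self.biased_model.predict(X)
        innocuous = self.innocuous_model.predict(X)
        return np.where(is_real == 1, biased, innocuous)


def fit_ood_detector(X_real, n_synthetic=None, noise_scale=0.3, seed=0):
    """Train a classifier to separate real rows from LIME-style perturbations."""
    rng = np.random.default_rng(seed)
    n_synthetic = n_synthetic or len(X_real)
    X_synth = X_real[rng.integers(0, len(X_real), n_synthetic)] + \
        rng.normal(0, noise_scale, (n_synthetic, X_real.shape[1]))
    X = np.vstack([X_real, X_synth])
    y = np.concatenate([np.ones(len(X_real)), np.zeros(n_synthetic)])
    return RandomForestClassifier(n_estimators=100).fit(X, y)
```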
no code implementations • 6 Nov 2019 • Dylan Slack, Sorelle Friedler, Emile Givental
Data sets for fairness-relevant tasks can lack examples or be biased according to a specific label in a sensitive attribute.
1 code implementation • 24 Aug 2019 • Dylan Slack, Sorelle Friedler, Emile Givental
Then, we illustrate the usefulness of both algorithms as a combined method for training models from a few data points on new tasks while using Fairness Warnings as interpretable boundary conditions under which the newly trained model may not be fair.
no code implementations • 9 Feb 2019 • Dylan Slack, Sorelle A. Friedler, Carlos Scheidegger, Chitradeep Dutta Roy
Through a user study with 1,000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable.