Search Results for author: Dan Ley

Found 11 papers, 4 papers with code

Generalized Group Data Attribution

no code implementations · 13 Oct 2024 · Dan Ley, Suraj Srinivas, Shichang Zhang, Gili Rusak, Himabindu Lakkaraju

Data Attribution (DA) methods quantify the influence of individual training data points on model outputs and have broad applications such as explainability, data selection, and noisy label identification.

Tasks: Computational Efficiency
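
For intuition, a minimal gradient-similarity sketch of the data-attribution idea (illustrative only; the names below are hypothetical and this is not the paper's generalized group method): score each training point by how well its loss gradient aligns with the loss gradient of a test point. A group variant would aggregate such scores over groups of training points rather than individuals, trading granularity for computational efficiency.

    import torch

    def loss_grad(model, x, y, loss_fn):
        # Flattened gradient of the loss at a single example.
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
        return torch.cat([g.reshape(-1) for g in grads])

    def influence_scores(model, train_set, test_x, test_y, loss_fn):
        # Dot-product alignment between each training gradient and the test gradient.
        g_test = loss_grad(model, test_x, test_y, loss_fn)
        return [torch.dot(loss_grad(model, x, y, loss_fn), g_test).item()
                for x, y in train_set]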

On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models

no code implementations · 15 Jun 2024 · Sree Harsha Tanneru, Dan Ley, Chirag Agarwal, Himabindu Lakkaraju

In this work, we explore the promise of three broad approaches commonly used to steer LLM behavior toward more faithful CoT reasoning: in-context learning, fine-tuning, and activation editing.

Tasks: In-Context Learning, Question Answering

In-Context Explainers: Harnessing LLMs for Explaining Black Box Models

1 code implementation · 9 Oct 2023 · Nicholas Kroeger, Dan Ley, Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju

Despite their effectiveness in enhancing the performance of LLMs on diverse language and tabular tasks, these methods have not been thoroughly explored for their potential to generate post hoc explanations.

Tasks: Explainable artificial intelligence, Explainable Artificial Intelligence (XAI) +2

On Minimizing the Impact of Dataset Shifts on Actionable Explanations

no code implementations · 11 Jun 2023 · Anna P. Meyer, Dan Ley, Suraj Srinivas, Himabindu Lakkaraju

To this end, we conduct a rigorous theoretical analysis demonstrating that model curvature, the weight decay parameter used during training, and the magnitude of the dataset shift are key factors determining the extent of explanation (in)stability.

Consistent Explanations in the Face of Model Indeterminacy via Ensembling

no code implementations · 9 Jun 2023 · Dan Ley, Leonard Tang, Matthew Nazari, Hongjin Lin, Suraj Srinivas, Himabindu Lakkaraju

This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy, which arises due to the existence of multiple (nearly) equally well-performing models for a given dataset and task.
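
A bare-bones sketch of the ensembling intuition (assumed names; not the paper's exact procedure): averaging feature attributions over several near-equally accurate models reduces the variance that model indeterminacy introduces into any single explanation.

    import numpy as np

    def ensembled_attribution(models, attribution_fn, x):
        # attribution_fn(model, x) -> per-feature scores; return the ensemble mean.
        scores = np.stack([attribution_fn(m, x) for m in models])
        return scores.mean(axis=0)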

Degraded Polygons Raise Fundamental Questions of Neural Network Perception

1 code implementation · NeurIPS 2023 · Leonard Tang, Dan Ley

Ultimately, we find that neural networks' behavior on this simple task conflicts with human behavior, raising fundamental questions about the robustness and learning capabilities of modern computer vision models.

GLOBE-CE: A Translation-Based Approach for Global Counterfactual Explanations

1 code implementation · 26 May 2023 · Dan Ley, Saumitra Mishra, Daniele Magazzeni

Counterfactual explanations have been widely studied in explainability, with a range of application dependent methods prominent in fairness, recourse and model understanding.

Tasks: counterfactual, Fairness +1
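
To illustrate the translation idea behind global counterfactuals (a toy sketch with assumed names and scaling rule, not the GLOBE-CE implementation): a single shared direction, scaled per input, is added to negatively classified points until their predictions flip.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    clf = LogisticRegression().fit(X, y)

    delta = np.array([1.0, 1.0])            # one global direction for all inputs
    delta /= np.linalg.norm(delta)
    negatives = X[clf.predict(X) == 0]

    for k in np.linspace(0.0, 3.0, 7):      # try increasing scales of the shared direction
        flipped = clf.predict(negatives + k * delta) == 1
        print(f"scale {k:.1f}: {flipped.mean():.0%} of negatives flipped")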

OpenXAI: Towards a Transparent Evaluation of Model Explanations

2 code implementations · 22 Jun 2022 · Chirag Agarwal, Dan Ley, Satyapriya Krishna, Eshika Saxena, Martin Pawelczyk, Nari Johnson, Isha Puri, Marinka Zitnik, Himabindu Lakkaraju

OpenXAI comprises the following key components: (i) a flexible synthetic data generator and a collection of diverse real-world datasets, pre-trained models, and state-of-the-art feature attribution methods, and (ii) open-source implementations of eleven quantitative metrics for evaluating the faithfulness, stability (robustness), and fairness of explanation methods, in turn enabling comparisons of several explanation methods across a wide variety of metrics, models, and datasets.

Tasks: Benchmarking, Explainable Artificial Intelligence (XAI) +1
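
As a rough illustration of what a faithfulness metric formalizes (a generic sketch, not OpenXAI's actual API; model and attribution here are placeholders): features an attribution ranks as most important should, when masked, change the prediction more than unimportant ones.

    import numpy as np

    def prediction_drop(model, x, attribution, k, baseline=0.0):
        # Drop in predicted probability after masking the top-k attributed features.
        top_k = np.argsort(-np.abs(attribution))[:k]
        x_masked = x.copy()
        x_masked[top_k] = baseline
        p = model.predict_proba(x[None])[0, 1]
        p_masked = model.predict_proba(x_masked[None])[0, 1]
        return p - p_masked                  # larger drop -> attribution looks more faithful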

Global Counterfactual Explanations: Investigations, Implementations and Improvements

no code implementations · 14 Apr 2022 · Dan Ley, Saumitra Mishra, Daniele Magazzeni

Counterfactual explanations have been widely studied in explainability, with a range of application dependent methods emerging in fairness, recourse and model understanding.

Tasks: counterfactual, Counterfactual Explanation +1

Diverse, Global and Amortised Counterfactual Explanations for Uncertainty Estimates

no code implementations · 5 Dec 2021 · Dan Ley, Umang Bhatt, Adrian Weller

To interpret uncertainty estimates from differentiable probabilistic models, recent work has proposed generating a single Counterfactual Latent Uncertainty Explanation (CLUE) for a given data point where the model is uncertain, identifying a single, on-manifold change to the input such that the model becomes more certain in its prediction.

Tasks: counterfactual, Diversity
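
A bare-bones sketch of the latent-space idea (illustrative assumptions throughout; this is not the CLUE algorithm): starting from the encoding of an uncertain input, take gradient steps on a latent code so that the decoded point lowers the classifier's predictive entropy while staying close to the original.

    import torch

    def latent_uncertainty_explanation(encoder, decoder, classifier, x, steps=100, lam=0.1):
        # Optimize a latent code z: low predictive entropy, small distance to the original encoding.
        z0 = encoder(x).detach()
        z = z0.clone().requires_grad_(True)
        opt = torch.optim.Adam([z], lr=1e-2)
        for _ in range(steps):
            probs = torch.softmax(classifier(decoder(z)), dim=-1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
            loss = entropy + lam * (z - z0).pow(2).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return decoder(z).detach()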

δ-CLUE: Diverse Sets of Explanations for Uncertainty Estimates

no code implementations · 13 Apr 2021 · Dan Ley, Umang Bhatt, Adrian Weller

To interpret uncertainty estimates from differentiable probabilistic models, recent work has proposed generating Counterfactual Latent Uncertainty Explanations (CLUEs).

Tasks: counterfactual
