Search Results for author: Been Kim

Found 50 papers, 21 papers with code

Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero

no code implementations 25 Oct 2023 Lisa Schut, Nenad Tomasev, Tom McGrath, Demis Hassabis, Ulrich Paquet, Been Kim

Artificial Intelligence (AI) systems have made remarkable progress, attaining super-human performance across various domains.

Game of Chess

Don't trust your eyes: on the (un)reliability of feature visualizations

1 code implementation 7 Jun 2023 Robert Geirhos, Roland S. Zimmermann, Blair Bilodeau, Wieland Brendel, Been Kim

Today, visualization methods form the foundation of our knowledge about the internal workings of neural networks, as a type of mechanistic interpretability.

Impossibility Theorems for Feature Attribution

1 code implementation 22 Dec 2022 Blair Bilodeau, Natasha Jaques, Pang Wei Koh, Been Kim

Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods.

Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation

no code implementations ICLR 2022 Julius Adebayo, Michael Muelly, Hal Abelson, Been Kim

We investigate whether three types of post hoc model explanations--feature attribution, concept activation, and training point ranking--are effective for detecting a model's reliance on spurious signals in the training data.

Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis

no code implementations 17 Jun 2022 Shayegan Omidshafiei, Andrei Kapishnikov, Yannick Assogba, Lucas Dixon, Been Kim

Each year, expert-level performance is attained in increasingly complex multiagent domains, where notable examples include Go, Poker, and StarCraft II.

Starcraft, Starcraft II +1

Advanced Methods for Connectome-Based Predictive Modeling of Human Intelligence: A Novel Approach Based on Individual Differences in Cortical Topography

no code implementations NeurIPS Workshop AI4Science 2021 Evan D. Anderson, Ramsey Wilcox, Anuj Nayak, Christopher Zwilling, Pablo Robles-Granda, Been Kim, Lav R. Varshney, Aron K. Barbey

Investigating the proposed modeling framework's efficacy, we find that advanced connectome-based predictive modeling accounts for a significantly greater proportion of variance in general intelligence scores than previously established methods, advancing our scientific understanding of the network architecture that underlies human intelligence.

feature selection

Human-Centered Concept Explanations for Neural Networks

no code implementations 25 Feb 2022 Chih-Kuan Yeh, Been Kim, Pradeep Ravikumar

We start by introducing concept explanations, including the class of Concept Activation Vectors (CAVs), which characterize concepts using vectors in appropriate spaces of neural activations, and discuss different properties of useful concepts as well as approaches to measuring the usefulness of concept vectors.
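
As a rough illustration of how a concept activation vector can be derived in practice, the sketch below fits a linear classifier that separates a layer's activations on concept examples from activations on random counterexamples, and takes the normal of its decision boundary as the CAV. The activations here are synthetic placeholders rather than outputs of a real network, and this is only a minimal sketch of the general recipe, not the authors' implementation.

```python
# Minimal sketch: deriving a Concept Activation Vector (CAV) from a layer's
# activations. Synthetic data stands in for real network activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 64                                                        # dimensionality of the chosen layer

# Placeholder activations: rows are examples, columns are layer units.
concept_acts = rng.normal(loc=0.5, scale=1.0, size=(200, d))  # e.g. images showing the concept
random_acts = rng.normal(loc=0.0, scale=1.0, size=(200, d))   # random counterexamples

X = np.vstack([concept_acts, random_acts])
y = np.concatenate([np.ones(200), np.zeros(200)])

# A linear classifier separates concept activations from random ones; the CAV
# is the unit-normalized normal vector of its decision boundary.
clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])
print("CAV shape:", cav.shape, "separability:", round(clf.score(X, y), 3))
```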

Subgoal-Based Explanations for Unreliable Intelligent Decision Support Systems

no code implementations 11 Jan 2022 Devleena Das, Been Kim, Sonia Chernova

Intelligent decision support (IDS) systems leverage artificial intelligence techniques to generate recommendations that guide human users through the decision making phases of a task.

Decision Making

Analyzing a Caching Model

no code implementations 13 Dec 2021 Leon Sixt, Evan Zheran Liu, Marie Pellat, James Wexler, Milad Hashemi, Been Kim, Martin Maas

Machine Learning has been successfully applied in systems applications such as memory prefetching and caching, where learned models have been shown to outperform heuristics.

Acquisition of Chess Knowledge in AlphaZero

no code implementations 17 Nov 2021 Thomas McGrath, Andrei Kapishnikov, Nenad Tomašev, Adam Pearce, Demis Hassabis, Been Kim, Ulrich Paquet, Vladimir Kramnik

In this work we provide evidence that human knowledge is acquired by the AlphaZero neural network as it trains on the game of chess.

Game of Chess

Best of both worlds: local and global explanations with human-understandable concepts

no code implementations 16 Jun 2021 Jessica Schrouff, Sebastien Baur, Shaobo Hou, Diana Mincu, Eric Loreaux, Ralph Blanes, James Wexler, Alan Karthikesalingam, Been Kim

While there are many methods focused on either one, few frameworks can provide both local and global explanations in a consistent manner.

DISSECT: Disentangled Simultaneous Explanations via Concept Traversals

1 code implementation ICLR 2022 Asma Ghandeharioun, Been Kim, Chun-Liang Li, Brendan Jou, Brian Eoff, Rosalind W. Picard

Explaining deep learning model inferences is a promising avenue for scientific understanding, improving safety, uncovering hidden biases, evaluating fairness, and beyond, as argued by many scholars.

counterfactual, Fairness +2

Debugging Tests for Model Explanations

1 code implementation NeurIPS 2020 Julius Adebayo, Michael Muelly, Ilaria Liccardi, Been Kim

For several explanation methods, we assess their ability to: detect spurious correlation artifacts (data contamination), diagnose mislabeled training examples (data contamination), differentiate between a (partially) re-initialized model and a trained one (model contamination), and detect out-of-distribution inputs (test-time contamination).

Concept Bottleneck Models

4 code implementations ICML 2020 Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang

We seek to learn models that we can interact with using high-level concepts: if the model did not think there was a bone spur in the x-ray, would it still predict severe arthritis?
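
To make the bottleneck idea concrete, here is a minimal sketch in which the label model sees the input only through predicted concepts, so a human can inspect or override a concept before the final prediction. The data, concept definitions, and two-stage scikit-learn pipeline are illustrative assumptions, not the paper's models.

```python
# Minimal sketch of the concept-bottleneck idea: predict human-interpretable
# concepts from the input, then predict the label from the concepts alone.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
n, d, k = 500, 20, 3                            # samples, input dims, concepts

X = rng.normal(size=(n, d))
concepts = (X[:, :k] > 0).astype(int)           # toy "ground-truth" concept labels
y = (concepts.sum(axis=1) >= 2).astype(int)     # label depends only on the concepts

# Stage 1: input -> concepts.  Stage 2: concepts -> label (the bottleneck).
concept_model = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, concepts)
label_model = LogisticRegression(max_iter=1000).fit(concepts, y)

# At test time the label model sees only predicted concepts, so a human can
# override one (e.g. "there is no bone spur") and re-predict.
c_hat = np.column_stack([p[:, 1] > 0.5 for p in concept_model.predict_proba(X)]).astype(int)
print("label accuracy via the concept bottleneck:", round(label_model.score(c_hat, y), 3))
```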

On Completeness-aware Concept-Based Explanations in Deep Neural Networks

2 code implementations NeurIPS 2020 Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar

Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable, which addresses the limitations of existing methods on concept explanations.
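
One way to read "completeness" here is: how well can the model's own predictions be recovered from concept scores alone? The sketch below illustrates that reading with synthetic activations and concept directions; it is a simplified stand-in, not the paper's exact metric.

```python
# Rough sketch of a completeness-style check: can the model's own predictions
# be recovered from concept scores alone?  Everything here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, m = 1000, 32, 4                        # samples, activation dims, concepts

acts = rng.normal(size=(n, d))               # placeholder layer activations
concept_vectors = rng.normal(size=(m, d))    # placeholder concept directions
model_preds = (acts @ rng.normal(size=d) > 0).astype(int)  # stand-in model output

# Concept scores = projections of activations onto the concept directions.
scores = acts @ concept_vectors.T

# Fit a simple map from concept scores back to the model's predictions; its
# accuracy relative to a trivial baseline gives a rough sense of how
# "complete" the concept set is as an explanation of the model's behaviour.
recover = LogisticRegression(max_iter=1000).fit(scores, model_preds)
baseline = max(model_preds.mean(), 1 - model_preds.mean())
print("recovery accuracy:", round(recover.score(scores, model_preds), 3),
      "baseline:", round(baseline, 3))
```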

On Concept-Based Explanations in Deep Neural Networks

no code implementations 25 Sep 2019 Chih-Kuan Yeh, Been Kim, Sercan Arik, Chun-Liang Li, Pradeep Ravikumar, Tomas Pfister

Next, we propose a concept discovery method that considers two additional constraints to encourage the interpretability of the discovered concepts.

Benchmarking Attribution Methods with Relative Feature Importance

2 code implementations 23 Jul 2019 Mengjiao Yang, Been Kim

Despite active development, quantitative evaluation of feature attribution methods remains difficult due to the lack of ground truth: we do not know which input features are in fact important to a model.

Benchmarking, Feature Importance

Explaining Classifiers with Causal Concept Effect (CaCE)

no code implementations 16 Jul 2019 Yash Goyal, Amir Feder, Uri Shalit, Been Kim

To overcome this problem, we define the Causal Concept Effect (CaCE) as the causal effect of (the presence or absence of) a human-interpretable concept on a deep neural net's predictions.

Causal Inference
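
As a toy illustration of the definition, the sketch below intervenes on a binary concept in a simple synthetic generative process and reports the difference in a classifier's average prediction under do(concept = 1) versus do(concept = 0). The generator and classifier are placeholders, not the paper's setup.

```python
# Hedged sketch of a causal-concept-effect estimate: intervene on a binary
# concept in a toy generative process and measure the change in a
# classifier's average predicted probability.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def generate(concept):
    """Toy generative process: a binary concept shifts part of the features."""
    n = len(concept)
    x = rng.normal(size=(n, 10))
    x[:, :3] += 2.0 * concept[:, None]       # the concept controls these features
    return x

# Train a classifier on observational data where the concept varies naturally.
c_obs = rng.integers(0, 2, size=2000)
X_obs = generate(c_obs)
y_obs = ((X_obs[:, 0] + X_obs[:, 5]) > 1.0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X_obs, y_obs)

# CaCE-style estimate: average prediction under do(concept=1) minus do(concept=0),
# sampled from the interventional distributions of the generator.
X_do1 = generate(np.ones(5000, dtype=int))
X_do0 = generate(np.zeros(5000, dtype=int))
cace = clf.predict_proba(X_do1)[:, 1].mean() - clf.predict_proba(X_do0)[:, 1].mean()
print("estimated causal concept effect:", round(cace, 3))
```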

Neural Networks Trained on Natural Scenes Exhibit Gestalt Closure

1 code implementation 4 Mar 2019 Been Kim, Emily Reif, Martin Wattenberg, Samy Bengio, Michael C. Mozer

The Gestalt laws of perceptual organization, which describe how visual elements in an image are grouped and interpreted, have traditionally been thought of as innate despite their ecological validity.

Image Classification

Towards Automatic Concept-based Explanations

2 code implementations NeurIPS 2019 Amirata Ghorbani, James Wexler, James Zou, Been Kim

Interpretability has become an important topic of research as more machine learning (ML) models are deployed and widely used to make important decisions.

Feature Importance

An Evaluation of the Human-Interpretability of Explanation

no code implementations 31 Jan 2019 Isaac Lage, Emily Chen, Jeffrey He, Menaka Narayanan, Been Kim, Sam Gershman, Finale Doshi-Velez

Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions.

BIG-bench Machine Learning

Interpreting Black Box Predictions using Fisher Kernels

no code implementations 23 Oct 2018 Rajiv Khanna, Been Kim, Joydeep Ghosh, Oluwasanmi Koyejo

Research in both machine learning and psychology suggests that salient examples can help humans to interpret learning models.

Data Summarization

Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values

2 code implementations 8 Oct 2018 Julius Adebayo, Justin Gilmer, Ian Goodfellow, Been Kim

Explaining the output of a complicated machine learning model like a deep neural network (DNN) is a central challenge in machine learning.

BIG-bench Machine Learning
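
A common way to probe this sensitivity is a parameter-randomization test: compute an explanation for a trained model, re-initialize the weights, recompute it, and check how similar the two explanations are. The sketch below does this for plain gradient saliency on a toy PyTorch network; the architecture, data, and similarity measure are illustrative choices, not the paper's experimental setup.

```python
# Hedged sketch of a parameter-randomization check for gradient saliency.
# High rank correlation between the two maps would suggest the explanation
# is insensitive to what the model actually learned.
import copy
import torch
from scipy.stats import spearmanr

torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(20, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))

# Train briefly on a toy task so the network has non-trivial weights.
X = torch.randn(512, 20)
y = (X[:, 0] + X[:, 1] > 0).long()
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    torch.nn.functional.cross_entropy(net(X), y).backward()
    opt.step()

def gradient_saliency(model, x):
    """Absolute gradient of the top logit with respect to the input."""
    x = x.clone().requires_grad_(True)
    out = model(x)
    out[0, out.argmax()].backward()
    return x.grad[0].abs()

# Re-initialize every layer that supports it, then compare saliency maps.
randomized = copy.deepcopy(net)
for layer in randomized:
    if hasattr(layer, "reset_parameters"):
        layer.reset_parameters()

x0 = X[:1]
rho, _ = spearmanr(gradient_saliency(net, x0).detach().numpy(),
                   gradient_saliency(randomized, x0).detach().numpy())
print("Spearman rank correlation, trained vs. randomized saliency:", round(rho, 3))
```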

Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018)

no code implementations 3 Jul 2018 Been Kim, Kush R. Varshney, Adrian Weller

This is the Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), which was held in Stockholm, Sweden, July 14, 2018.

BIG-bench Machine Learning

xGEMs: Generating Examplars to Explain Black-Box Models

no code implementations 22 Jun 2018 Shalmali Joshi, Oluwasanmi Koyejo, Been Kim, Joydeep Ghosh

This work proposes xGEMs, or manifold-guided exemplars, a framework to understand black-box classifier behavior by exploring the landscape of the underlying data manifold as data points cross decision boundaries.

To Trust Or Not To Trust A Classifier

1 code implementation NeurIPS 2018 Heinrich Jiang, Been Kim, Melody Y. Guan, Maya Gupta

Knowing when a classifier's prediction can be trusted is useful in many applications and critical for safely using AI.

Topological Data Analysis
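
The paper's trust score compares, roughly, how close a test point is to the predicted class versus the closest other class. The simplified sketch below uses one nearest neighbour per class on toy data and omits the density-filtering step described in the paper.

```python
# Simplified sketch of a trust-score-style check: ratio of the distance to the
# nearest training point of any other class over the distance to the nearest
# point of the predicted class.  A ratio well above 1 suggests more trust.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

X_train, y_train = make_blobs(n_samples=300, centers=3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One nearest-neighbour index per class.
nn_by_class = {c: NearestNeighbors(n_neighbors=1).fit(X_train[y_train == c])
               for c in np.unique(y_train)}

def trust_score(x):
    pred = clf.predict(x.reshape(1, -1))[0]
    d_pred = nn_by_class[pred].kneighbors(x.reshape(1, -1))[0][0, 0]
    d_other = min(nn_by_class[c].kneighbors(x.reshape(1, -1))[0][0, 0]
                  for c in nn_by_class if c != pred)
    return d_other / (d_pred + 1e-12)

x_test = X_train[0] + 0.1
print("prediction:", clf.predict(x_test.reshape(1, -1))[0],
      "trust score:", round(trust_score(x_test), 2))
```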

How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation

no code implementations 2 Feb 2018 Menaka Narayanan, Emily Chen, Jeffrey He, Been Kim, Sam Gershman, Finale Doshi-Velez

Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions.

BIG-bench Machine Learning

TCAV: Relative concept importance testing with Linear Concept Activation Vectors

2 code implementations ICLR 2018 Been Kim, Justin Gilmer, Martin Wattenberg, Fernanda Viégas

In particular, this framework enables non-machine learning experts to express concepts of interest and test hypotheses using examples (e.g., a set of pictures that illustrate the concept).

Medical Diagnosis
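
Given a CAV (see the sketch further up this list) and gradients of a class logit with respect to the layer activations, a TCAV-style score is the fraction of class examples whose directional derivative along the CAV is positive. The sketch below computes that fraction from synthetic placeholder gradients rather than a real network.

```python
# Hedged sketch of a TCAV-style score from a CAV and per-example gradients.
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 300

cav = rng.normal(size=d)
cav /= np.linalg.norm(cav)                  # unit concept direction (placeholder)

# Placeholder gradients of the target-class logit w.r.t. the layer activations,
# one row per example of the class being explained.
grads = rng.normal(loc=0.1, scale=1.0, size=(n, d))

directional_derivs = grads @ cav            # sensitivity of the logit to the concept
tcav_score = float((directional_derivs > 0).mean())
print("TCAV score (fraction of positively influenced examples):", round(tcav_score, 3))
```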

Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017)

no code implementations 8 Aug 2017 Been Kim, Dmitry M. Malioutov, Kush R. Varshney, Adrian Weller

This is the Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017), which was held in Sydney, Australia, August 10, 2017.

BIG-bench Machine Learning

Towards A Rigorous Science of Interpretable Machine Learning

no code implementations 28 Feb 2017 Finale Doshi-Velez, Been Kim

As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs.

BIG-bench Machine Learning, Interpretable Machine Learning +1

Examples are not enough, learn to criticize! Criticism for Interpretability

no code implementations NeurIPS 2016 Been Kim, Rajiv Khanna, Oluwasanmi O. Koyejo

Example-based explanations are widely used in the effort to improve the interpretability of highly complex distributions.
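
In the spirit of the prototypes-and-criticisms idea, the sketch below greedily selects prototypes that match the data distribution under an RBF kernel and then picks criticisms where the witness function (data density minus prototype density) is largest in magnitude. It is a stripped-down sketch on toy data that omits the paper's regularizers.

```python
# Hedged sketch of prototype and criticism selection with an MMD-style objective.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics.pairwise import rbf_kernel

X, _ = make_blobs(n_samples=200, centers=4, random_state=0)
K = rbf_kernel(X, gamma=0.5)
n = len(X)

def mmd_objective(idx):
    """Higher means the selected prototypes better match the data under MMD."""
    idx = list(idx)
    return 2.0 * K[:, idx].mean() - K[np.ix_(idx, idx)].mean()

prototypes = []
for _ in range(5):                                    # pick 5 prototypes greedily
    best = max((i for i in range(n) if i not in prototypes),
               key=lambda i: mmd_objective(prototypes + [i]))
    prototypes.append(best)

# Witness function: where the data has mass the prototypes fail to cover (or
# vice versa).  Non-prototype points with the largest |witness| are criticisms.
witness = K.mean(axis=1) - K[:, prototypes].mean(axis=1)
candidates = np.array([i for i in range(n) if i not in prototypes])
criticisms = candidates[np.argsort(-np.abs(witness[candidates]))[:3]]
print("prototype indices:", prototypes)
print("criticism indices:", criticisms.tolist())
```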

Proceedings of NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems

no code implementations 28 Nov 2016 Andrew Gordon Wilson, Been Kim, William Herlands

This is the Proceedings of the NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems, which was held in Barcelona, Spain, December 9, 2016.

BIG-bench Machine Learning, Interpretable Machine Learning

Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016)

no code implementations 8 Jul 2016 Been Kim, Dmitry M. Malioutov, Kush R. Varshney

This is the Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), which was held in New York, NY, June 23, 2016.

BIG-bench Machine Learning

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification

no code implementations NeurIPS 2014 Been Kim, Cynthia Rudin, Julie Shah

We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering.

Classification, Clustering +1

Learning About Meetings

no code implementations 8 Jun 2013 Been Kim, Cynthia Rudin

Most people participate in meetings almost every day, multiple times a day.

Inferring Robot Task Plans from Human Team Meetings: A Generative Modeling Approach with Logic-Based Prior

no code implementations 5 Jun 2013 Been Kim, Caleb M. Chacha, Julie Shah

We present an algorithm that reduces this translation burden by inferring the final plan from a processed form of the human team's planning conversation.

Disaster Response, Translation
