Interpretable Machine Learning

140 papers with code • 1 benchmark • 2 datasets

The goal of Interpretable Machine Learning is to allow oversight and understanding of machine-learned decisions. Much of the work in Interpretable Machine Learning has come in the form of devising methods to better explain the predictions of machine learning models.

Source: Assessing the Local Interpretability of Machine Learning Models



Most implemented papers

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

ramprs/grad-cam ICCV 2017

For captioning and VQA, we show that even non-attention-based models can localize inputs.
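Grad-CAM weights each convolutional feature map by the global-average-pooled gradient of the class score, sums them, and applies a ReLU. A minimal numpy sketch of that computation, assuming the feature maps `A` and their gradients `grads` have already been extracted from a network (the shapes and names here are illustrative, not from the authors' code):

```python
import numpy as np

# Hypothetical inputs: last-conv-layer activations A (C, H, W) and the
# gradient of the class score w.r.t. those maps, grads (C, H, W).
rng = np.random.default_rng(0)
A = rng.random((8, 7, 7))
grads = rng.standard_normal((8, 7, 7))

# 1) Global-average-pool the gradients to get one importance weight per channel.
alpha = grads.mean(axis=(1, 2))                    # shape (8,)

# 2) Weighted sum of feature maps, then ReLU to keep only features
#    with a positive influence on the class score.
cam = np.maximum((alpha[:, None, None] * A).sum(axis=0), 0.0)  # shape (7, 7)

# 3) Normalize to [0, 1] so it can be overlaid on the input as a heatmap.
cam = cam / (cam.max() + 1e-8)
```

In practice the map is then upsampled to the input resolution and overlaid on the image.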

Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting

google-research/google-research 19 Dec 2019

Multi-horizon forecasting problems often contain a complex mix of inputs -- including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically -- without any prior information on how they interact with the target.

Axiomatic Attribution for Deep Networks

ankurtaly/Attributions ICML 2017

We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works.
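The paper's method, Integrated Gradients, attributes a prediction by averaging the gradient along the straight-line path from a baseline to the input and scaling by the input difference. A self-contained sketch using a toy analytically-differentiable function in place of a network (the function and names are illustrative):

```python
import numpy as np

# Toy stand-in for a network: f(x) = sum(x_i^2), with gradient 2x.
def f(x):
    return float((x ** 2).sum())

def grad_f(x):
    return 2.0 * x

def integrated_gradients(x, baseline, steps=100):
    # Riemann approximation of the path integral: average the gradient at
    # interpolated points between baseline and x, then scale by (x - baseline).
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.mean(
        [grad_f(baseline + a * (x - baseline)) for a in alphas], axis=0
    )
    return (x - baseline) * avg_grad

x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline).
```

For this quadratic toy function the attributions come out exactly as `x**2`, and their sum equals `f(x) - f(baseline)`, illustrating the completeness axiom the paper derives.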

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

marcotcr/lime-experiments 16 Feb 2016

Despite widespread adoption, machine learning models remain mostly black boxes.
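LIME explains a single prediction by sampling perturbed inputs around it and fitting a proximity-weighted linear surrogate to the black box's outputs. A minimal sketch of that idea (not the `lime` package; the black-box function and kernel width are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    # Hypothetical opaque model: a nonlinear function of two features.
    return np.sin(X[:, 0]) + X[:, 1] ** 2

x0 = np.array([0.5, 1.0])                         # instance to explain
Z = x0 + 0.1 * rng.standard_normal((500, 2))      # perturbed neighborhood
y = black_box(Z)

# Proximity weights: samples closer to x0 matter more.
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.02)

# Weighted least squares for a local linear surrogate y ~ b0 + b @ (z - x0).
Xd = np.hstack([np.ones((len(Z), 1)), Z - x0])
W = np.diag(w)
coef = np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ y)
intercept, local_effects = coef[0], coef[1:]
# local_effects approximates the local gradient [cos(0.5), 2.0] near x0.
```

The surrogate's coefficients are the explanation: they describe how the black box behaves in the neighborhood of `x0`.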

SmoothGrad: removing noise by adding noise

PAIR-code/saliency 12 Jun 2017

Explaining the output of a deep network remains a challenge.
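SmoothGrad sharpens a gradient saliency map by averaging it over many Gaussian-noised copies of the input. A sketch with a toy gradient function standing in for backprop through a network (the function, `n`, and `sigma` are illustrative choices, not values from the paper):

```python
import numpy as np

def saliency(x):
    # Hypothetical per-element gradient of a class score;
    # here the derivative of sum(sin(x)).
    return np.cos(x)

def smoothgrad(x, n=50, sigma=0.15, seed=0):
    # Average the saliency over n noisy copies of the input.
    rng = np.random.default_rng(seed)
    grads = [saliency(x + sigma * rng.standard_normal(x.shape)) for _ in range(n)]
    return np.mean(grads, axis=0)

x = np.linspace(0, np.pi, 16)
smoothed = smoothgrad(x)
```

The averaging suppresses the high-frequency fluctuations that make raw gradient maps noisy, at the cost of `n` extra forward/backward passes.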

A Unified Approach to Interpreting Model Predictions

slundberg/shap NeurIPS 2017

Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications.
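SHAP's unifying quantity is the Shapley value of each feature. A brute-force sketch that computes exact Shapley values for a tiny 3-feature model, to show what the `shap` library approximates efficiently (the model and baseline are illustrative, and this enumeration is exponential in the number of features):

```python
import numpy as np
from itertools import combinations
from math import factorial

def value(subset, x, baseline, model):
    # Evaluate the model with features outside `subset` set to baseline.
    z = baseline.copy()
    for i in subset:
        z[i] = x[i]
    return model(z)

def shapley_values(x, baseline, model):
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Shapley weight for a coalition of size |S|.
                wgt = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += wgt * (value(S + (i,), x, baseline, model)
                                 - value(S, x, baseline, model))
    return phi

model = lambda z: 2 * z[0] + z[1] * z[2]   # hypothetical model with an interaction
x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
phi = shapley_values(x, baseline, model)
# Local accuracy: phi sums to model(x) - model(baseline).
```

Here the linear feature gets its full coefficient effect and the interaction term is split equally between the two interacting features, and the attributions sum to the gap between the prediction and the baseline output.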

Learning Important Features Through Propagating Activation Differences

slundberg/shap ICML 2017

Here we present DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input.
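DeepLIFT's core idea is to propagate contributions relative to a reference input using multipliers of the form Δoutput/Δinput rather than instantaneous gradients. A one-nonlinearity sketch of the rescale rule on a toy ReLU "network" (not the authors' implementation; inputs and reference are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def deeplift_relu_contrib(x, x_ref):
    dx = x - x_ref                        # difference-from-reference input
    dy = relu(x) - relu(x_ref)            # difference-from-reference output
    # Rescale rule: multiplier m = dy/dx where dx != 0, else 0.
    m = np.where(dx != 0, dy / np.where(dx != 0, dx, 1.0), 0.0)
    return m * dx                         # contribution equals dy exactly

x = np.array([2.0, -1.0, 0.5])
x_ref = np.array([-1.0, -2.0, 0.0])
contrib = deeplift_relu_contrib(x, x_ref)
```

Unlike a plain gradient, which is zero or one regardless of the reference, these contributions always sum to the actual change in the layer's output, which is what lets DeepLIFT assign importance even through saturated units.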

BreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis

MIMBCD-UI/prototype-multi-modality 7 Apr 2020

This paper describes the field research, design and comparative deployment of a multimodal medical imaging user interface for breast screening.

Understanding Neural Networks Through Deep Visualization

yosinski/deep-visualization-toolbox 22 Jun 2015

The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream).
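The tool's core mechanism is simply recording every layer's activations during a forward pass so they can be rendered. A minimal sketch with a tiny random numpy MLP standing in for the convnet (weights and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2-layer network with random weights.
weights = [rng.standard_normal((16, 32)), rng.standard_normal((32, 10))]

def forward_with_activations(x):
    # Run the forward pass, keeping each layer's output for visualization.
    activations = []
    h = x
    for W in weights:
        h = np.maximum(h @ W, 0.0)     # ReLU layer
        activations.append(h)
    return activations

acts = forward_with_activations(rng.standard_normal(16))
# Each entry can now be reshaped and plotted as a per-layer heatmap.
```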

Interpretable Explanations of Black Boxes by Meaningful Perturbation

ruthcfong/perturb_explanations ICCV 2017

As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions.