Bayesian Importance of Features (BIF)

26 Oct 2020  ·  Kamil Adamczewski, Frederik Harder, Mijung Park ·

We introduce a simple and intuitive framework that provides quantitative explanations of statistical models through the probabilistic assessment of input feature importance. The core idea comes from utilizing the Dirichlet distribution to define the importance of input features and learning it via approximate Bayesian inference. The learned importance has probabilistic interpretation and provides the relative significance of each input feature to a model's output, additionally assessing confidence about its importance quantification. As a consequence of using the Dirichlet distribution over the explanations, we can define a closed-form divergence to gauge the similarity between learned importance under different models. We use this divergence to study the feature importance explainability tradeoffs with essential notions in modern machine learning, such as privacy and fairness. Furthermore, BIF can work on two levels: global explanation (feature importance across all data instances) and local explanation (individual feature importance for each data instance). We show the effectiveness of our method on a variety of synthetic and real datasets, taking into account both tabular and image datasets. The code is available at

PDF Abstract


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here