Interpretability

Contextual Decomposition Explanation Penalization

Introduced by Rieger et al. in Interpretations are useful: penalizing explanations to align neural networks with prior knowledge

Contextual Decomposition Explanation Penalization (CDEP) is a method that leverages existing explanation techniques for neural networks to prevent a model from learning unwanted relationships and, ultimately, to improve predictive accuracy. Given importance scores from an explanation method, CDEP lets the user directly penalize the importance of specific features or feature interactions. This forces the neural network to produce not only the correct prediction but also the correct explanation for that prediction.
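The idea of adding an explanation penalty to the prediction loss can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a logistic regression rather than a neural network, and it uses the additive contribution of a feature group to the logit as a simple stand-in for contextual decomposition scores. The names `train`, `penalized_mask`, and `lam`, and the synthetic data, are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, penalized_mask, lam, lr=0.1, steps=500):
    """Gradient descent on: cross-entropy + lam * mean |importance of masked features|.

    Assumption: the importance of the masked feature group for one example is
    its additive contribution to the logit, sum_j mask_j * w_j * x_j. This is a
    stand-in for a contextual decomposition score, chosen for simplicity.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        # Gradient of the mean cross-entropy (prediction) loss.
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        # Per-example importance of the penalized features.
        contrib = X @ (w * penalized_mask)
        # Subgradient of lam * mean(|contrib|) with respect to w.
        grad_w += lam * (np.sign(contrib) @ X) / n * penalized_mask
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data (hypothetical): feature 0 is a noisy but genuine signal,
# feature 1 is a spurious shortcut that matches the training label exactly.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n).astype(float)
signal = (2 * y - 1) + rng.normal(size=n)
spurious = 2 * y - 1
X = np.column_stack([signal, spurious])

mask = np.array([0.0, 1.0])  # penalize only the spurious feature
w_plain, _ = train(X, y, mask, lam=0.0)  # prediction loss only
w_cdep, _ = train(X, y, mask, lam=1.0)   # prediction loss + explanation penalty

print("weights without penalty:", w_plain)
print("weights with penalty:   ", w_cdep)
```

In this sketch the penalty keeps the weight on the spurious feature close to zero, so the model has to rely on the genuine signal. With a neural network, the same loss structure would apply, with the stand-in importance replaced by scores from the chosen explanation method.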

Source: Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
