Noise Modulation: Let Your Model Interpret Itself

19 Mar 2021 · Haoyang Li, Xinggang Wang

Given the great success of Deep Neural Networks (DNNs) and their black-box nature, the interpretability of these models has become an important issue. The majority of previous research focuses on the post-hoc interpretation of a trained model. Recently, however, adversarial training has shown that a model can acquire an interpretable input-gradient through training. Yet adversarial training is an inefficient way to obtain interpretability. To resolve this problem, we construct an approximation of the adversarial perturbations and discover a connection between adversarial training and amplitude modulation. Based on a digital analogy, we propose noise modulation as an efficient and model-agnostic alternative for training a model that interprets itself with input-gradients. Experimental results show that noise modulation can effectively increase the interpretability of input-gradients in a model-agnostic way.
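
For readers unfamiliar with the term, the following is a minimal sketch of what an input-gradient is: the gradient of the loss with respect to the input, used as a per-feature saliency map. It does not implement noise modulation or adversarial training, neither of which is specified in the abstract. The tiny softmax-linear classifier, its random weights, and the function names (`softmax`, `cross_entropy`, `input_gradient`) are assumptions chosen purely for illustration.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(W, b, x, label):
    # Loss of a hypothetical linear classifier: logits = W @ x + b.
    p = softmax(W @ x + b)
    return -np.log(p[label])

def input_gradient(W, b, x, label):
    # Analytic gradient of the cross-entropy loss with respect to the input x.
    # For a softmax-linear model this is W^T (p - onehot(label)).
    p = softmax(W @ x + b)
    onehot = np.zeros_like(p)
    onehot[label] = 1.0
    return W.T @ (p - onehot)

# Illustrative random model and input (not taken from the paper).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # 3 classes, 5 input features
b = rng.normal(size=3)
x = rng.normal(size=5)

# One value per input feature; larger magnitude means the loss is
# more sensitive to that feature.
saliency = input_gradient(W, b, x, label=1)
```

For a deep network the same quantity would normally be obtained by automatic differentiation rather than a closed-form expression; the closed form is used here only so the sketch stays dependency-light.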
