Explaining Classifiers using Adversarial Perturbations on the Perceptual Ball

CVPR 2021 · Andrew Elliott, Stephen Law, Chris Russell ·

We present a simple regularization of adversarial perturbations based upon the perceptual loss. While the resulting perturbations remain imperceptible to the human eye, they differ from existing adversarial perturbations in that they are semi-sparse alterations that highlight objects and regions of interest while leaving the background unaltered. As a semantically meaningful adverse perturbations, it forms a bridge between counterfactual explanations and adversarial perturbations in the space of images. We evaluate our approach on several standard explainability benchmarks, namely, weak localization, insertion deletion, and the pointing game demonstrating that perceptually regularized counterfactuals are an effective explanation for image-based classifiers.

PDF Abstract CVPR 2021 PDF CVPR 2021 Abstract

Code

Add Remove Mark official

No code implementations yet. Submit your code now

Tasks

Add Remove

counterfactual

Datasets

Add Datasets introduced or used in this paper

Results from the Paper

Edit

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods

Add Remove

Counterfactuals

Edit Social Preview

Explaining Classifiers using Adversarial Perturbations on the Perceptual Ball

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove