SAD: Saliency Adversarial Defense without Adversarial Training

1 Jan 2021  ·  Yao Zhu, Jiacheng Sun, Zewei Chen, Zhenguo Li

Adversarial training is one of the most effective methods for defending against adversarial attacks, but it is computationally costly. In this paper, we propose Saliency Adversarial Defense (SAD), an efficient defense algorithm that avoids adversarial training. The saliency map is added to the input with a hybridization ratio to enhance the pixels that are important for making decisions. This process shifts the data distribution away from that of the original data. Interestingly, we find that this shift can be effectively corrected by updating the statistics of batch normalization with the processed data, without any further training. We justify the algorithm with a linear model, showing that the added saliency maps pull data away from their closest decision boundary, and that updating BN effectively evolves the decision boundary to fit the new data. As a result, the distance between the decision boundary and the original inputs is increased, so the model can withstand stronger attacks and thus becomes more robust. We then show experimentally that these results still hold for complex models and datasets. Our results demonstrate that SAD is superior in defending against various attacks, including both white-box and black-box ones.
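The following is a minimal sketch of the procedure described in the abstract, assuming a PyTorch image classifier. The gradient-based saliency map, the additive hybridization with ratio `alpha`, and the BN-statistics update loop are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def saliency_map(model, x, y):
    """Input-gradient saliency: |d loss / d x|, normalized per sample (assumption)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    sal = grad.abs()
    # Scale each sample's saliency to [0, 1]; assumes 4D image tensors (N, C, H, W).
    peak = sal.flatten(1).max(dim=1).values.view(-1, 1, 1, 1)
    return sal / (peak + 1e-12)

def hybridize(x, sal, alpha=0.5):
    """Add the saliency map to the input with hybridization ratio alpha.
    The exact mixing rule is an assumption based on the abstract's description."""
    return torch.clamp(x + alpha * sal, 0.0, 1.0)

@torch.no_grad()
def update_bn_statistics(model, loader, device, alpha=0.5):
    """Re-estimate BatchNorm running statistics on the hybridized data,
    with all weights frozen (no further training)."""
    # Reset running stats so they are re-estimated from scratch (assumption).
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
    model.train()  # train mode so BN layers update running mean/var on forward passes
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        with torch.enable_grad():          # saliency needs gradients w.r.t. the input
            sal = saliency_map(model, x, y)
        model(hybridize(x, sal, alpha))    # forward pass only; updates BN statistics
    model.eval()
```

In this reading, inference would presumably also run on hybridized inputs so that they match the updated BN statistics, though the abstract does not spell out the test-time pipeline.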
