An Adversarial Attack via Feature Contributive Regions

1 Jan 2021  ·  Yaguan Qian, Jiamin Wang, Xiang Ling, Zhaoquan Gu, Bin Wang, Chunming Wu ·

Recently, many advanced algorithms have been proposed to address the vulnerability of CNNs to adversarial examples. Most of these algorithms modify global pixels directly with small perturbations, while some work modifies local pixels instead. However, global attacks suffer from perturbation redundancy, and local attacks are less effective. To overcome this challenge, we achieve a trade-off between perturbation power and the number of perturbed pixels. The key idea is to find the feature contributive regions (FCRs) of an image. Furthermore, to keep the adversarial example as similar as possible to the corresponding clean image, we redefine a loss function as the objective of the optimization and then use a gradient descent algorithm to find efficient perturbations. Extensive experiments on the CIFAR-10 and ILSVRC2012 datasets demonstrate the effectiveness of the method; in addition, the FCRs attack shows strong attack ability in both white-box and black-box settings.
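The abstract describes an optimization-based attack that confines perturbations to selected regions and bounds their magnitude. The sketch below illustrates that general idea with a toy differentiable classifier: gradient ascent on the loss is applied only inside a binary region mask (a stand-in for the paper's FCRs, whose exact construction is not given here), and the perturbation is clipped to an epsilon budget. The model, mask, and hyperparameters are all illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def masked_attack(x, y, W, mask, eps=2.0, lr=0.5, steps=100):
    """Gradient-ascent attack on a toy linear softmax classifier W.
    Perturbs only pixels where mask == 1 (a stand-in for the paper's
    feature contributive regions) and clips the perturbation to [-eps, eps],
    trading off perturbation power against the number of perturbed pixels."""
    delta = np.zeros_like(x)
    onehot = np.eye(W.shape[0])[y]
    for _ in range(steps):
        p = softmax(W @ (x + delta))
        # gradient of cross-entropy loss w.r.t. the input: W^T (p - onehot(y))
        g = W.T @ (p - onehot)
        delta += lr * g * mask             # ascend the loss only inside the mask
        delta = np.clip(delta, -eps, eps)  # bound the perturbation power
    return x + delta

# Illustrative usage: a 2-class linear model on a 4-pixel "image".
W = np.array([[1., 0., 0., 0.],
              [0., 1., 0., 0.]])
x = np.array([1., 0., 0., 0.])      # classified as class 0
mask = np.array([0., 1., 0., 0.])   # only the second pixel may change
x_adv = masked_attack(x, 0, W, mask)
```

Restricting the update to the mask keeps pixels outside the chosen regions untouched, which is the mechanism by which a region-based attack avoids the perturbation redundancy of global attacks.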


