Sparse Adversarial Attack via Perturbation Factorization

This work studies the sparse adversarial attack, which aims to perturb only a subset of pixels of a benign image such that the perturbed image is misclassified by a deep neural network (DNN) model. The sparse adversarial attack involves two challenges: where to perturb, and how to determine the perturbation magnitude. Many existing works determine the perturbed positions manually or heuristically, and then optimize the magnitudes with an algorithm designed for the dense adversarial attack. In this work, we propose to factorize the perturbation at each pixel into the product of two variables: a continuous perturbation magnitude and a binary selection factor (0 or 1). A pixel is perturbed if its selection factor is 1, and left unperturbed otherwise. Based on this factorization, we formulate the sparse attack as a mixed integer programming (MIP) problem that jointly optimizes the binary selection factors and the continuous perturbation magnitudes of all pixels, with a cardinality constraint on the selection factors to explicitly control the degree of sparsity. Moreover, the factorization offers the flexibility to incorporate other meaningful constraints on the selection factors or magnitudes, such as group-wise sparsity or enhanced visual imperceptibility. We develop an efficient algorithm by equivalently reformulating the MIP problem as a continuous optimization problem. Experiments on benchmark datasets demonstrate the superiority of the proposed method over several state-of-the-art sparse attack methods.
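The abstract describes the formulation in words; the display below restates it symbolically. The notation (benign image $\mathbf{x}$, true label $y$, classifier $f$, attack loss $\mathcal{L}$, magnitudes $\boldsymbol{\epsilon}$, selection factors $\mathbf{G}$, sparsity budget $k$) is an illustrative reconstruction, not necessarily the paper's own symbols:

$$
\min_{\boldsymbol{\epsilon},\,\mathbf{G}} \; \mathcal{L}\bigl(f(\mathbf{x} + \boldsymbol{\epsilon} \odot \mathbf{G}),\, y\bigr)
\quad \text{s.t.} \quad \mathbf{G} \in \{0,1\}^{n},\;\; \lVert \mathbf{G} \rVert_{1} \le k,\;\; \mathbf{x} + \boldsymbol{\epsilon} \odot \mathbf{G} \in [0,1]^{n},
$$

where $\odot$ denotes the element-wise product and, since $\mathbf{G}$ is binary, $\lVert \mathbf{G} \rVert_{1} = \lVert \mathbf{G} \rVert_{0}$, so the constraint caps the number of perturbed pixels at $k$.

To make the factorization concrete, here is a minimal PyTorch sketch. It assumes an untargeted attack and approximates the binary selection factors with a hard top-k mask trained via a straight-through estimator; the paper instead solves the MIP through an equivalent continuous reformulation, so the optimizer, hyper-parameters, and the function `sparse_attack_sketch` are illustrative assumptions only, not the authors' method:

```python
import torch
import torch.nn.functional as F

def sparse_attack_sketch(model, x, y, k=100, steps=200, lr=0.05):
    """Illustrative sketch of perturbation factorization (not the paper's
    exact algorithm): each element's perturbation is magnitude * selector,
    and at most k selectors are active at any time.

    x: benign image batch of shape (1, C, H, W) with values in [0, 1]
    y: true label tensor of shape (1,); k: sparsity budget (in elements).
    """
    eps = torch.zeros_like(x, requires_grad=True)               # magnitudes
    scores = (0.01 * torch.randn_like(x)).requires_grad_(True)  # relaxed selectors
    opt = torch.optim.Adam([eps, scores], lr=lr)

    def topk_mask(s):
        # Hard 0/1 mask keeping the k largest scores. For simplicity each
        # tensor element is selected independently; a per-pixel mask would
        # share one selector across the C channels instead.
        flat = s.flatten()
        hard = torch.zeros_like(flat)
        hard[flat.topk(k).indices] = 1.0
        return hard.reshape(s.shape)

    for _ in range(steps):
        hard = topk_mask(scores)
        # Straight-through estimator: the forward pass uses the binary mask,
        # the backward pass treats it as the identity w.r.t. `scores`.
        mask = hard + scores - scores.detach()
        x_adv = (x + eps * mask).clamp(0.0, 1.0)
        # Untargeted attack: push the prediction away from the true label.
        loss = -F.cross_entropy(model(x_adv), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        mask = topk_mask(scores)
        return (x + eps * mask).clamp(0.0, 1.0), mask
```

Because k enters only through the top-k projection, the sparsity level is controlled exactly rather than indirectly through a penalty term; replacing the element-wise top-k with a top-k over pixel groups would yield the kind of group-wise sparsity the abstract mentions.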
