An Efficient Pre-processing Method to Eliminate Adversarial Effects

Deep Neural Networks (DNNs) are vulnerable to adversarial examples generated by imposing subtle perturbations to inputs that lead a model to predict incorrect outputs. Currently, a large number of researches on defending adversarial examples pay little attention to the real-world applications, either with high computational complexity or poor defensive effects. Motivated by this observation, we develop an efficient preprocessing method to defend adversarial images. Specifically, before an adversarial example is fed into the model, we perform two image transformations: WebP compression, which is utilized to remove the small adversarial noises. Flip operation, which flips the image once along one side of the image to destroy the specific structure of adversarial perturbations. Finally, a de-perturbed sample is obtained and can be correctly classified by DNNs. Experimental results on ImageNet show that our method outperforms the state-of-the-art defense methods. It can effectively defend adversarial attacks while ensure only very small accuracy drop on normal images.

Results in Papers With Code
(↓ scroll down to see all results)