Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser

CVPR 2018 Fangzhou Liao • Ming Liang • Yinpeng Dong • Tianyu Pang • Xiaolin Hu • Jun Zhu

HGD overcomes this problem by using a loss function defined as the difference between the target model's outputs for the clean image and for the denoised image. First, with HGD as a defense, the target model is more robust to both white-box and black-box adversarial attacks. Second, HGD can be trained on a small subset of the images and generalizes well to other images and unseen classes.
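
The loss described above can be illustrated with a minimal sketch. It is not the authors' implementation. The L1 distance, the function names (`hgd_loss`, `pixel_loss`), and the toy model and denoisers are all assumptions made for illustration. The sketch contrasts a loss measured on the target model's outputs with a loss measured directly on pixels.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], List[float]]


def l1_distance(a: Vector, b: Vector) -> float:
    """Sum of absolute element-wise differences (assumed distance measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def hgd_loss(target_model: Model, denoiser: Model,
             clean: Vector, adversarial: Vector) -> float:
    """Difference between the target model's outputs for the clean image
    and for the denoised adversarial image."""
    denoised = denoiser(adversarial)
    return l1_distance(target_model(clean), target_model(denoised))


def pixel_loss(denoiser: Model, clean: Vector, adversarial: Vector) -> float:
    """Pixel-level loss of a plain denoiser, shown only for contrast."""
    return l1_distance(clean, denoiser(adversarial))


# Hypothetical stand-ins: a "model" that returns two summary features,
# a denoiser that does nothing, and one that restores the clean image.
def toy_model(image: Vector) -> List[float]:
    return [sum(image), max(image)]


def identity_denoiser(image: Vector) -> List[float]:
    return list(image)


clean_image = [0.0, 0.5, 1.0]
adversarial_image = [0.25, 0.5, 1.0]


def perfect_denoiser(image: Vector) -> List[float]:
    return list(clean_image)


loss_without_denoising = hgd_loss(
    toy_model, identity_denoiser, clean_image, adversarial_image)
loss_with_perfect_denoising = hgd_loss(
    toy_model, perfect_denoiser, clean_image, adversarial_image)
```

In a real setting the denoiser would be a trainable network and the loss would be minimized over its parameters while the target model stays fixed. Here the two toy denoisers only show that the loss shrinks to zero when the denoised image produces the same model outputs as the clean one.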

