Improve Adversarial Robustness via Weight Penalization on Classification Layer

8 Oct 2020  ·  Cong Xu, Dan Li, Min Yang ·

It is well-known that deep neural networks are vulnerable to adversarial attacks. Recent studies show that well-designed classification parts can lead to better robustness. However, there is still much space for improvement along this line. In this paper, we first prove that, from a geometric point of view, the robustness of a neural network is equivalent to some angular margin condition of the classifier weights. We then explain why ReLU type function is not a good choice for activation under this framework. These findings reveal the limitations of the existing approaches and lead us to develop a novel light-weight-penalized defensive method, which is simple and has a good scalability. Empirical results on multiple benchmark datasets demonstrate that our method can effectively improve the robustness of the network without requiring too much additional computation, while maintaining a high classification precision for clean data.

PDF Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods