Collaborate to Defend Against Adversarial Attacks

29 Sep 2021  ·  Sen Cui, Jingfeng Zhang, Jian Liang, Masashi Sugiyama, ChangShui Zhang ·

Adversarially robust learning methods require predictions to be invariant within a small neighborhood of their natural inputs, and thus often run into insufficient model capacity. Learning multiple models in an ensemble can mitigate this insufficiency, further improving both generalization and robustness. However, an ensemble still wastes the limited capacity of the multiple models. To optimally utilize the limited capacity, this paper proposes to learn a collaboration among multiple sub-models. Compared with an ensemble, the collaboration makes a correct prediction possible even when only a single sub-model is correct. Besides, learning a collaboration lets every sub-model fit its own vulnerability area and reserves the rest of the sub-models to fit other vulnerability areas. To implement this idea, we propose a collaboration framework---CDA$^2$, short for Collaborate to Defend against Adversarial Attacks. CDA$^2$ effectively minimizes the vulnerability overlap of all sub-models and then chooses a representative sub-model to make correct predictions. Empirical experiments verify that CDA$^2$ outperforms various ensemble methods against black-box and white-box adversarial attacks.
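The contrast the abstract draws can be illustrated with a toy sketch: an averaging ensemble can be outvoted by fooled sub-models, while a collaboration that routes each input to one sub-model can still answer correctly when a single sub-model survives the attack. Everything below is illustrative, not the paper's implementation; in particular, the confidence-based selector is only a stand-in for the learned routing in CDA$^2$, and the probability values are fabricated for demonstration.

```python
import numpy as np

def ensemble_predict(probs):
    """Average the sub-models' class probabilities, then take the argmax."""
    return int(np.mean(probs, axis=0).argmax())

def collaboration_predict(probs, selector):
    """Route the input to the single sub-model chosen by the selector."""
    return int(probs[selector(probs)].argmax())

# Hypothetical class probabilities from three sub-models for one
# adversarial input whose true class is 1. Only sub-model 2 resists
# the attack; the other two are fooled.
probs = np.array([
    [0.90, 0.10],  # sub-model 0: wrong
    [0.80, 0.20],  # sub-model 1: wrong
    [0.05, 0.95],  # sub-model 2: correct
])

# Toy selector: pick the most confident sub-model. CDA^2 instead learns
# which sub-model is representative for each input.
selector = lambda p: int(np.max(p, axis=1).argmax())

print(ensemble_predict(probs))                 # averaging is outvoted -> 0
print(collaboration_predict(probs, selector))  # routing finds sub-model 2 -> 1
```

Here the averaged probabilities are roughly (0.58, 0.42), so the ensemble predicts the wrong class 0, while the collaboration delegates to sub-model 2 and predicts the true class 1, matching the abstract's claim that one correct sub-model can suffice.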
