Manifold-aware Training: Increase Adversarial Robustness with Feature Clustering
The problem of defending against adversarial attacks has attracted increasing attention in recent years. While various types of defense methods ($\textit{e.g.}$, adversarial training, detection and rejection, and recovery) were proven empirically to bring robustness to the network, their weakness was shown by later works. Inspired by the observation from the distribution properties of the features extracted by the CNNs in the feature space and their link to robustness, this work designs a novel training process called Manifold-Aware Training (MAT), which forces CNNs to learn compact features to increase robustness. The effectiveness of the proposed method is evaluated via comparisons with existing defense mechanisms, $\textit{i.e.}$, the TRADES algorithm, which has been recognized as a representative state-of-the-art technology, and the MMC method, which also aims to learn compact features. Further verification is also conducted using the attack adaptive to our method. Experimental results show that MAT-trained CNNs exhibit significantly higher performance than state-of-the-art robustness.
PDF Abstract