Attention for Adversarial Attacks: Learning from your Mistakes

In order to apply neural networks in safety-critical settings, such as healthcare or autonomous driving, we need to be able to analyse their robustness against adversarial attacks. As complete verification is often computationally prohibitive, we rely on cheap and effective adversarial attacks to estimate robustness. However, state-of-the-art adversarial attacks, such as the frequently used PGD attack, often require many random restarts to generate adversarial examples, and each restart discards all previous unsuccessful runs. To alleviate this inefficiency, we propose a method that learns from its mistakes. Specifically, our method uses Graph Neural Networks (GNNs) as an attention mechanism to greatly reduce the search space for the attacks. The architecture of the GNN is based on the neural network we are attacking, and we perform forward and backward passes through the GNN, mimicking the back-propagation algorithm of PGD attacks. The GNN outputs a smaller subspace for the PGD attack to focus on. Using our method, we boost the attack's performance: the GNN increases the success rate of PGD by over 35% on a recently published dataset used for comparing adversarial attacks, while simultaneously reducing its average computation time.
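
To make the idea concrete, the sketch below shows a standard L-infinity PGD attack with random restarts in which the perturbation is confined to a coordinate subspace described by a binary attention mask. This is a minimal illustration under the assumption that the GNN's output can be expressed as such a mask; it is not the paper's implementation, and the names `masked_pgd`, `net`, and `mask` are hypothetical.

```python
# Illustrative sketch only: assumes the attention produced by a GNN can be
# expressed as a binary mask over input coordinates. `masked_pgd`, `net`,
# and `mask` are hypothetical names, not taken from the paper.
import torch
import torch.nn.functional as F

def masked_pgd(net, x, y, mask, eps=8/255, alpha=2/255, steps=40, restarts=5):
    """L-infinity PGD with random restarts, confined to the subspace selected by `mask`."""
    adv = x
    for _ in range(restarts):
        # Random start inside the eps-ball, restricted to the masked coordinates.
        delta = (torch.empty_like(x).uniform_(-eps, eps) * mask).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(net(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            # Ascend the loss, updating only the coordinates the mask selects.
            delta = (delta + alpha * grad.sign() * mask).clamp(-eps, eps)
            # Keep the perturbed input a valid image in [0, 1].
            delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
        adv = (x + delta).clamp(0, 1).detach()
        if (net(adv).argmax(dim=1) != y).all():
            return adv  # every input misclassified; no further restarts needed
    return adv  # last attempt if no restart fully succeeded
```

Restricting both the random initialisation and each gradient step to the masked coordinates is what shrinks the search space; everything else is the usual PGD loop.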
