Tightening the Approximation Error of Adversarial Risk with Auto Loss Function Search

9 Nov 2021 · Pengfei Xia, Ziqiang Li, Bin Li

Despite their great success, Deep Neural Networks (DNNs) are vulnerable to adversarial examples. Accurately evaluating the adversarial robustness of DNNs is therefore critical for their deployment in real-world applications. An ideal indicator of robustness is the adversarial risk. Unfortunately, since computing it involves maximizing the 0-1 loss, the true risk is technically intractable. The most common workaround is to compute an approximate risk by replacing the 0-1 loss with a surrogate one. Several such functions have been used, including the Cross-Entropy (CE) loss and the Difference of Logits Ratio (DLR) loss. However, these functions are all manually designed and may not be well suited to adversarial robustness evaluation. In this paper, we leverage AutoML to tighten the error (gap) between the true and approximate risks. Our main contributions are as follows. First, we propose AutoLoss-AR, the first method for searching surrogate losses for adversarial risk, equipped with an elaborate search space. Experimental results on 10 adversarially trained models demonstrate the effectiveness of the method: the risks evaluated with the best-discovered losses are 0.2% to 1.6% better than those evaluated with the handcrafted baselines. Second, 5 surrogate losses with clean, readable formulas are distilled and tested on 7 unseen adversarially trained models. These losses outperform the baselines by 0.8% to 2.4%, indicating that they can be used on their own as a form of new knowledge. Finally, we explore possible reasons for the better performance of these losses.
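The setup the paper builds on can be sketched as follows (the searched losses themselves are not reproduced here). The true adversarial risk places the 0-1 loss inside the inner maximization; the approximate risk instead maximizes a surrogate loss and then evaluates the 0-1 loss at the point the attack finds. Since the surrogate maximizer need not maximize the 0-1 loss, the approximate risk can only under-estimate the true one, and the paper's search aims to tighten that gap.

```latex
% True adversarial risk: intractable because of the 0-1 loss inside the max.
R_{\mathrm{adv}}(f) = \mathbb{E}_{(x,y)\sim\mathcal{D}}
    \left[ \max_{\|\delta\|_p \le \epsilon} \mathbb{1}\{ f(x+\delta) \neq y \} \right]

% Approximate risk: maximize a surrogate loss \ell instead, then evaluate
% the 0-1 loss at the maximizer \hat{\delta}.
\hat{\delta} = \arg\max_{\|\delta\|_p \le \epsilon} \ell\big(f(x+\delta), y\big),
\qquad
\hat{R}_{\mathrm{adv}}(f) = \mathbb{E}_{(x,y)\sim\mathcal{D}}
    \left[ \mathbb{1}\{ f(x+\hat{\delta}) \neq y \} \right]
```

Below is a minimal PyTorch sketch of this approximate evaluation with the two handcrafted baselines mentioned above. The helper `approx_adversarial_risk`, its PGD settings (`eps`, `alpha`, `steps`), and the l-inf threat model are illustrative assumptions, not the paper's exact protocol; `dlr_loss` follows the published DLR formula of Croce & Hein (2020).

```python
import torch
import torch.nn.functional as F

def ce_loss(logits, y):
    # Standard cross-entropy surrogate, one value per example.
    return F.cross_entropy(logits, y, reduction="none")

def dlr_loss(logits, y):
    # Difference of Logits Ratio (Croce & Hein, 2020), assumes >= 3 classes:
    # DLR(z, y) = -(z_y - max_{i != y} z_i) / (z_{pi_1} - z_{pi_3}),
    # where pi sorts the logits in decreasing order.
    z_sorted, _ = logits.sort(dim=1, descending=True)
    z_y = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    top_is_true = z_sorted[:, 0] == z_y
    z_other = torch.where(top_is_true, z_sorted[:, 1], z_sorted[:, 0])
    return -(z_y - z_other) / (z_sorted[:, 0] - z_sorted[:, 2] + 1e-12)

def approx_adversarial_risk(model, x, y, loss_fn,
                            eps=8 / 255, alpha=2 / 255, steps=20):
    # Inner maximization: l-inf PGD on the surrogate loss (hypothetical
    # attack settings chosen for illustration).
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = loss_fn(model(x + delta), y).sum()
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta = (x + delta).clamp(0, 1) - x  # keep inputs in [0, 1]
    # Outer evaluation: the 0-1 loss at the points found by the attack.
    with torch.no_grad():
        pred = model(x + delta).argmax(dim=1)
    return (pred != y).float().mean().item()
```

A searched loss would simply replace `loss_fn` in this sketch; under a fixed attack budget, a higher measured robust error means a tighter approximation of the true adversarial risk.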
