Can we achieve robustness from data alone?
We introduce a meta-learning algorithm for adversarially robust classification. The proposed method is designed to be as model-agnostic as possible: it optimizes a dataset before the dataset is deployed in a machine learning system, with the aim of erasing its non-robust features. Once the dataset has been created, in principle no specialized algorithm beyond standard gradient descent is needed to train a robust model. We formulate the data optimization procedure as a bi-level optimization problem on kernel regression, using a class of kernels that describes infinitely wide neural networks (Neural Tangent Kernels). We present extensive experiments on standard computer vision benchmarks with a variety of models, demonstrating the effectiveness of our method while also pointing out its current shortcomings. In parallel, we revisit prior work that also addressed data optimization for robust classification \citep{Ily+19}, and show that achieving robustness to adversarial attacks through standard (gradient descent) training on a suitable dataset is more challenging than previously thought.
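As a rough illustration of what a bi-level problem on kernel regression can look like, consider the following sketch. It is not taken from the abstract and should not be read as the exact objective of the method. All notation is assumed for illustration: $(X_s, y_s)$ is the dataset being optimized, $\mathcal{D}$ is the original training set, $K$ is a kernel such as a Neural Tangent Kernel, $\lambda$ is a ridge parameter, $\ell$ is a classification loss, and $\epsilon$ is a perturbation budget under some chosen norm.

\begin{align}
f(x; X_s, y_s) &= K(x, X_s)\,\bigl(K(X_s, X_s) + \lambda I\bigr)^{-1} y_s, \\
\min_{X_s,\, y_s} \;& \sum_{(x, y) \in \mathcal{D}} \; \max_{\|\delta\| \le \epsilon} \; \ell\bigl(f(x + \delta; X_s, y_s),\, y\bigr).
\end{align}

In this hypothetical form, the inner level is the kernel regression fit, which has the closed-form solution in the first line, and the outer level adjusts the dataset so that the resulting predictor stays accurate under worst-case perturbations of the original inputs.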