Adversarial Robustness
603 papers with code • 7 benchmarks • 9 datasets
Adversarial Robustness evaluates the vulnerabilities of machine learning models under various types of adversarial attacks.
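To make the threat model concrete, here is a minimal sketch of a gradient-based (FGSM-style) attack on a toy logistic-regression classifier. The weights, bias, and input are made up for illustration; a one-step perturbation of bounded size flips the model's prediction.

```python
import numpy as np

# Minimal FGSM-style attack on a toy logistic-regression model.
# All numbers here (weights, input, epsilon) are illustrative.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y, eps):
    """One-step l_inf attack: move x by eps in the sign of the loss gradient."""
    p = sigmoid(w @ x + b)      # predicted probability of class 1
    grad_x = (p - y) * w        # d(log-loss)/dx for logistic regression
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, -0.2, 0.1])  # score 0.75 > 0 -> classified as class 1
y = 1.0

x_adv = fgsm_perturb(x, w, b, y, eps=0.5)
print(w @ x + b > 0)       # True: original input is class 1
print(w @ x_adv + b > 0)   # False: the perturbed input is misclassified
```

A model is adversarially robust to the extent that such small, bounded perturbations fail to change its predictions.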
Most implemented papers
Adversarial Robustness as a Prior for Learned Representations
In this work, we show that robust optimization can be re-cast as a tool for enforcing priors on the features learned by deep neural networks.
On Evaluating Adversarial Robustness
Correctly evaluating defenses against adversarial examples has proven to be extremely difficult.
Unlabeled Data Improves Adversarial Robustness
We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semi-supervised learning.
Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples
In the setting with additional unlabeled data, we obtain an accuracy under attack of 65.88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6.35% with respect to prior art).
Decoupled Kullback-Leibler Divergence Loss
In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and observe that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss, which consists of 1) a weighted Mean Square Error (wMSE) loss and 2) a Cross-Entropy loss incorporating soft labels.
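As background for why KL-based losses decompose this way, the classical identity KL(p || q) = H(p, q) − H(p) already splits the KL loss into a cross-entropy term with soft labels and a term constant in the student distribution. The numpy check below verifies that identity on made-up distributions; it is not the paper's exact DKL decomposition.

```python
import numpy as np

# Classical identity behind KL-based losses with soft labels:
#   KL(p || q) = H(p, q) - H(p)
# i.e. cross-entropy against teacher distribution p, minus p's entropy
# (constant w.r.t. the student q). Distributions here are illustrative.

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def cross_entropy(p, q):
    return float(-np.sum(p * np.log(q)))

def entropy(p):
    return float(-np.sum(p * np.log(p)))

p = np.array([0.7, 0.2, 0.1])   # teacher ("soft label") distribution
q = np.array([0.5, 0.3, 0.2])   # student distribution

assert np.isclose(kl(p, q), cross_entropy(p, q) - entropy(p))
```

Because H(p) does not depend on q, minimizing KL(p || q) over the student is equivalent to minimizing the soft-label cross-entropy, which is why the two loss families are so closely linked.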
Towards the first adversarially robust neural network model on MNIST
Despite much effort, deep neural networks remain highly susceptible to tiny input perturbations; even for MNIST, one of the most common toy datasets in computer vision, no neural network model exists for which adversarial perturbations are large and make semantic sense to humans.
GenAttack: Practical Black-box Attacks with Gradient-Free Optimization
Our experiments on different datasets (MNIST, CIFAR-10, and ImageNet) show that GenAttack can successfully generate visually imperceptible adversarial examples against state-of-the-art image recognition models with orders of magnitude fewer queries than previous approaches.
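To illustrate the query-only setting GenAttack operates in, here is a much-simplified random-search attack: it never touches gradients, only the model's output scores, and accepts any bounded perturbation that lowers the true-class score. The linear scorer and all constants are made up; GenAttack itself uses a genetic algorithm rather than this naive search.

```python
import numpy as np

# Black-box attack sketch: random search over l_inf-bounded perturbations,
# keeping any candidate that lowers the true-class score. A simplified
# stand-in for GenAttack's gradient-free genetic search; the "model"
# below is an illustrative linear scorer.

rng = np.random.default_rng(0)

def score(x):
    """Black-box model: only its outputs are observable, not its gradients."""
    w = np.array([1.0, -2.0, 0.5])
    return float(w @ x)

def random_search_attack(x, eps, steps=200, step_size=0.1):
    x_adv = x.copy()
    best = score(x_adv)
    for _ in range(steps):
        candidate = x_adv + step_size * rng.choice([-1.0, 1.0], size=x.shape)
        candidate = np.clip(candidate, x - eps, x + eps)  # stay in the l_inf ball
        s = score(candidate)          # one model query per candidate
        if s < best:                  # keep only queries that help the attack
            x_adv, best = candidate, s
    return x_adv

x = np.array([0.3, -0.2, 0.1])        # score(x) = 0.75 -> class 1
x_adv = random_search_attack(x, eps=0.5)
print(score(x), score(x_adv))
```

The attack's cost is measured in queries; GenAttack's contribution is driving that query count down by orders of magnitude relative to earlier black-box methods.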
Certified Adversarial Robustness with Additive Noise
The existence of adversarial data examples has drawn significant attention in the deep-learning community; such data are seemingly minimally perturbed relative to the original data, but lead to very different outputs from a deep-learning algorithm.
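The additive-noise idea behind certified defenses can be sketched as randomized smoothing: classify by majority vote over Gaussian perturbations of the input. The toy base classifier below is made up, and the sketch omits the statistical certificate (the bound on the certified radius) that the actual method derives from the vote margin.

```python
import numpy as np

# Randomized-smoothing sketch: predict by majority vote of the base
# classifier over Gaussian-noised copies of the input. The base
# classifier is a toy threshold rule; real certified robustness also
# needs a bound relating the vote margin to a certified radius.

rng = np.random.default_rng(1)

def base_classifier(x):
    """Toy base classifier: class 1 iff the mean coordinate exceeds 0."""
    return int(np.mean(x) > 0.0)

def smoothed_classify(x, sigma=0.25, n=1000):
    noise = rng.normal(0.0, sigma, size=(n,) + x.shape)
    votes = np.array([base_classifier(x + z) for z in noise])
    counts = np.bincount(votes, minlength=2)
    top = int(np.argmax(counts))
    return top, counts[top] / n   # predicted class and empirical vote share

x = np.array([0.2, 0.1, 0.3])     # mean 0.2 > 0 -> base class 1
label, share = smoothed_classify(x)
print(label, round(share, 2))
```

Intuitively, a large vote share means the decision is stable under additive noise, which is what the certificate turns into a guaranteed robustness radius.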
advertorch v0.1: An Adversarial Robustness Toolbox based on PyTorch
advertorch is a toolbox for adversarial robustness research.
Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy
Adversarial robustness has become a central goal in deep learning, both in theory and in practice.