An Adversarial Attack is a technique for finding a perturbation that changes the prediction of a machine learning model. The perturbation can be very small and imperceptible to the human eye.
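A minimal sketch of one classic way to find such a perturbation, the fast gradient sign method (FGSM), written in PyTorch; the model, inputs, and epsilon value here are placeholders, not a specific paper's setup:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """One gradient step that maximizes the loss (FGSM).

    `model`, `x` (input batch), and `y` (true labels) are placeholders;
    `eps` bounds the L-infinity norm of the perturbation, so each pixel
    moves by at most eps and the change stays visually small.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()   # sign-based step of size eps per pixel
    return x_adv.clamp(0, 1).detach() # keep the result a valid image
```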
Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.
An adversarial example library for constructing attacks, building defenses, and benchmarking both.
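A minimal usage sketch, assuming the CleverHans 4.x PyTorch interface; the toy model and random inputs below are stand-ins for a real classifier and dataset:

```python
import torch
from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method

# Toy stand-in classifier; any PyTorch module returning logits works.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)  # batch of random "images" in [0, 1]

# One-call FGSM: returns the perturbed batch under an L-infinity budget.
x_adv = fast_gradient_method(model, x, eps=0.03, norm=float("inf"))
```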
In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
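A simplified sketch of the forward-derivative idea behind exploiting that input-output mapping, not the paper's full saliency-map algorithm; `model`, `x`, and `target` are placeholders:

```python
import torch

def most_salient_feature(model, x, target):
    """Find the input feature whose increase most raises the target logit.

    A sketch of the Jacobian/saliency idea only: `x` is a single-example
    batch, `target` is the class the attacker wants. The full attack
    iteratively perturbs the features this search identifies.
    """
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    logits[0, target].backward()       # gradient of target logit w.r.t. input
    grad = x.grad.flatten()
    return grad.argmax().item()        # index of the most influential feature
```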
Based on this observation, we propose a defense approach that inspects the graph and recovers potential adversarial perturbations.
Evaluating adversarial robustness amounts to finding the minimum perturbation needed to have an input sample misclassified.
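For norm-bounded attacks, one common way to estimate that minimum is a binary search over the perturbation budget. A minimal sketch, where `is_misclassified(eps)` is an illustrative oracle (e.g., a closure that runs a fixed attack on a fixed input at budget `eps`):

```python
def minimal_eps(is_misclassified, lo=0.0, hi=1.0, steps=20):
    """Binary-search the smallest perturbation budget that flips the label.

    `is_misclassified(eps)` is a placeholder oracle reporting whether some
    fixed attack succeeds at budget `eps`. Assumes success is monotone in
    eps, which holds for most norm-bounded attacks.
    """
    if not is_misclassified(hi):
        return None  # even the largest budget fails to flip the label
    for _ in range(steps):
        mid = (lo + hi) / 2
        if is_misclassified(mid):
            hi = mid  # attack succeeds: try a smaller budget
        else:
            lo = mid  # attack fails: need a larger budget
    return hi
```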
Foolbox is a new Python package to generate such adversarial perturbations and to quantify and compare the robustness of machine learning models.
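A minimal sketch of Foolbox's PyTorch interface as of Foolbox 3; the model and data below are toy stand-ins:

```python
import torch
import foolbox as fb

# Toy stand-in classifier and data; swap in a trained model in practice.
model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10)).eval()
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

# Run a bounded L-infinity PGD attack and measure how often it succeeds.
attack = fb.attacks.LinfPGD()
raw, clipped, is_adv = attack(fmodel, images, labels, epsilons=0.03)
print(f"fooled {is_adv.float().mean().item():.0%} of the batch at eps=0.03")
```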
TextAttack also includes data augmentation and adversarial training modules for using components of adversarial attacks to improve model accuracy and robustness.
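A brief sketch of the augmentation side, assuming TextAttack's WordNet-based augmenter; the parameter values here are illustrative, not recommended settings:

```python
from textattack.augmentation import WordNetAugmenter

# Swap a fraction of words for WordNet synonyms to augment training data.
augmenter = WordNetAugmenter(pct_words_to_swap=0.2,
                             transformations_per_example=2)
print(augmenter.augment("The movie was surprisingly good"))
```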
Adversarial attacks for discrete data (such as text) have proved significantly more challenging than attacks for continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods.
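A library-free sketch of the discrete search this implies: a greedy synonym-substitution attack, where `classify` and `synonyms` are placeholder inputs and no gradients are used anywhere:

```python
def greedy_word_swap(classify, words, synonyms, threshold=0.5):
    """Greedy discrete search over synonym substitutions.

    `classify(words)` is a placeholder returning the true-label probability;
    `synonyms[word]` is a placeholder synonym table. Each swap that lowers
    the true-label score is kept; the search stops once the score drops
    below `threshold` (i.e., the label flips for a binary classifier).
    """
    words = list(words)
    best = classify(words)
    for i, w in enumerate(words):
        for cand in synonyms.get(w, []):
            trial = words[:i] + [cand] + words[i + 1:]
            score = classify(trial)
            if score < best:          # swap helps: keep it
                words, best = trial, score
            if best < threshold:      # label flipped: done
                return words
    return words
```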