387 papers with code • 3 benchmarks • 7 datasets

An Adversarial Attack is a technique to find a perturbation that changes the prediction of a machine learning model. The perturbation can be very small and imperceptible to human eyes.
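
The idea can be illustrated with a minimal sketch on a hypothetical linear classifier (NumPy only, not taken from any of the papers below): a fast-gradient-sign-style perturbation moves the input against the gradient of the current class score, flipping the prediction with a small L-infinity change.

```python
import numpy as np

# Toy linear binary classifier (hypothetical weights, for illustration only).
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def predict(x):
    """Predict class 1 if the linear score w.x + b is positive, else class 0."""
    return int(w @ x + b > 0)

x = np.array([0.3, 0.2, 0.4])      # clean input, classified as 1
eps = 0.3                          # L-inf budget of the perturbation

# FGSM-style step: the gradient of the score w.r.t. x is just w, so
# stepping by -eps * sign(w) lowers the score as fast as possible.
x_adv = x - eps * np.sign(w)

print(predict(x), predict(x_adv))  # the prediction flips: 1 0
```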

## Libraries

Use these libraries to find Adversarial Attack models and implementations.

# Towards Deep Learning Models Resistant to Adversarial Attacks

The paper casts adversarial robustness as a robust-optimization problem; this principled framing enables the authors to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.
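
The attack at the core of this paper is projected gradient descent (PGD). A minimal sketch of the PGD inner loop on a hypothetical linear model (NumPy only, not the paper's code): repeat small signed gradient steps and project back onto the L-infinity ball of radius eps around the clean input.

```python
import numpy as np

# Toy linear model (hypothetical weights, for illustration only).
w = np.array([1.0, -2.0, 0.5])
b = 0.1
score = lambda x: w @ x + b        # class-1 score

def pgd_attack(x0, eps=0.3, alpha=0.1, steps=10):
    """Iteratively lower the score while staying within eps of x0 (L-inf)."""
    x = x0.copy()
    for _ in range(steps):
        grad = w                               # d(score)/dx for a linear model
        x = x - alpha * np.sign(grad)          # signed gradient step
        x = np.clip(x, x0 - eps, x0 + eps)     # projection onto the eps-ball
    return x

x0 = np.array([0.3, 0.2, 0.4])     # clean input with positive score
x_adv = pgd_attack(x0)
print(score(x0) > 0, score(x_adv) > 0)   # True False
```

Multiple projected steps let PGD explore the constraint set more thoroughly than a single FGSM step, which is why the paper treats it as a strong first-order adversary.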

# Towards Evaluating the Robustness of Neural Networks

16 Aug 2016

Defensive distillation is a recently proposed approach that can take an arbitrary neural network, and increase its robustness, reducing the success rate of current attacks' ability to find adversarial examples from $95\%$ to $0.5\%$.

# Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

3 Oct 2016

An adversarial example library for constructing attacks, building defenses, and benchmarking both.

# The Limitations of Deep Learning in Adversarial Settings

24 Nov 2015

In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
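
A sketch of the Jacobian-based saliency idea behind this paper, on a toy linear multiclass model (hypothetical setup, not the paper's full algorithm): features whose perturbation increases the target class score while decreasing the other classes' scores are the most "salient" to modify.

```python
import numpy as np

# Toy linear multiclass model: rows are class scores, columns are features
# (hypothetical weights, for illustration only).
W = np.array([[ 1.0, -1.0,  0.2],
              [-0.5,  2.0, -1.0],
              [ 0.3, -0.8,  1.5]])

def saliency_map(target):
    """Positive saliency where the target score rises and the others fall."""
    jac = W                                  # Jacobian of scores w.r.t. x is W
    target_grad = jac[target]
    others_grad = jac.sum(axis=0) - target_grad
    return np.where((target_grad > 0) & (others_grad < 0),
                    target_grad * -others_grad, 0.0)

x = np.array([0.2, 0.1, 0.3])
target = 1                                   # class we want the model to output
i = int(np.argmax(saliency_map(target)))     # most salient feature
x_adv = x.copy()
x_adv[i] += 0.8                              # perturb only that one feature
print(int(np.argmax(W @ x)), int(np.argmax(W @ x_adv)))  # 2 1
```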

# Deep Variational Information Bottleneck

1 Dec 2016

We present a variational approximation to the information bottleneck of Tishby et al. (1999).
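
The information bottleneck objective being approximated can be stated as follows ($\beta$ is the compression trade-off coefficient):

```latex
% Information bottleneck (Tishby et al., 1999): learn a stochastic encoding
% Z of the input X that stays predictive of the label Y while compressing X.
\max_{p(z \mid x)} \; I(Z; Y) - \beta \, I(Z; X)
```

The deep variational approximation replaces the intractable mutual-information terms with a tractable bound of the form $\mathbb{E}[\log q(y \mid z)] - \beta\,\mathrm{KL}\big(p(z \mid x)\,\|\,r(z)\big)$, optimized with the reparameterization trick.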

# Provable defenses against adversarial examples via the convex outer adversarial polytope

We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data.

# Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

3 Oct 2019

Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions.

# Foolbox: A Python toolbox to benchmark the robustness of machine learning models

13 Jul 2017

Foolbox is a new Python package to generate such adversarial perturbations and to quantify and compare the robustness of machine learning models.

13 Sep 2017

Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples - a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify.

# Boosting Adversarial Attacks with Momentum

To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that the adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks.
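
A sketch of the momentum iterative step on a toy linear model (hypothetical setup, NumPy only): a velocity of $L_1$-normalized gradients is accumulated, and the update follows the sign of the velocity, which stabilizes update directions across iterations.

```python
import numpy as np

# Toy linear model (hypothetical weights, for illustration only).
w = np.array([1.0, -2.0, 0.5])
b = 0.1
score = lambda x: w @ x + b

def mi_fgsm(x0, eps=0.3, alpha=0.1, mu=1.0, steps=10):
    """Lower the score with momentum-accumulated signed gradient steps."""
    x, g = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        grad = -w                                # gradient of the attack loss
        g = mu * g + grad / np.abs(grad).sum()   # L1-normalized momentum update
        x = x + alpha * np.sign(g)               # step by the sign of velocity
        x = np.clip(x, x0 - eps, x0 + eps)       # stay inside the L-inf budget
    return x

x0 = np.array([0.3, 0.2, 0.4])
x_adv = mi_fgsm(x0)
print(score(x0) > 0, score(x_adv) > 0)   # True False
```

Against a single model this behaves like iterated FGSM; the paper's black-box gains come from combining this momentum update with an ensemble of surrogate models, which the sketch above does not include.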
