Adversarial Defense

163 papers with code • 9 benchmarks • 5 datasets

Most implemented papers

Towards Deep Learning Models Resistant to Adversarial Attacks

MadryLab/mnist_challenge ICLR 2018

The principled nature of this optimization view also enables the authors to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.
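The training method in this paper solves a min-max problem whose inner maximization is projected gradient descent (PGD). A minimal numpy sketch of that inner loop, assuming a hypothetical `grad_fn` standing in for a backward pass through the model:

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.3, alpha=0.01, steps=40):
    """Projected gradient descent inside an L-infinity ball of radius eps.

    grad_fn(x_adv) is assumed to return dLoss/dx_adv (a hypothetical
    stand-in for the model's backward pass).
    """
    x_adv = x + np.random.uniform(-eps, eps, size=x.shape)  # random start
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)        # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv
```

Adversarial training then minimizes the loss on `pgd_attack(x, ...)` instead of on `x` itself.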

Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

tensorflow/cleverhans 3 Oct 2016

An adversarial example library for constructing attacks, building defenses, and benchmarking both

Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

hendrycks/robustness ICLR 2019

The authors also propose a new dataset, ImageNet-P, which enables researchers to benchmark a classifier's robustness to common perturbations.
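The companion ImageNet-C benchmark evaluates classifiers on inputs degraded by common corruptions at increasing severity levels. A toy sketch of one such corruption (Gaussian noise); the severity-to-sigma mapping below is illustrative, not the paper's exact constants:

```python
import numpy as np

def gaussian_noise_corruption(x, severity=1, seed=0):
    """Apply Gaussian noise at a given severity level (1-5) to an image
    in [0, 1], in the spirit of the ImageNet-C protocol.

    The sigma values are placeholders for illustration only.
    """
    sigma = [0.04, 0.06, 0.08, 0.09, 0.10][severity - 1]
    rng = np.random.default_rng(seed)
    return np.clip(x + rng.normal(0.0, sigma, x.shape), 0.0, 1.0)
```

Robustness is then reported as accuracy averaged over all corruption types and severities.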

The Limitations of Deep Learning in Adversarial Settings

openai/cleverhans 24 Nov 2015

In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
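The attack introduced here (JSMA) perturbs the input features that most influence the model's output, as measured by a saliency map built from input-output gradients. A heavily simplified single-feature sketch, assuming the class gradients are already computed (the paper's full algorithm searches over feature pairs):

```python
import numpy as np

def saliency_step(x, grad_target, grad_others, theta=0.1):
    """One step of a saliency-map style attack (simplified from JSMA).

    grad_target: gradient of the target-class score w.r.t. each feature.
    grad_others: summed gradient of all other class scores (both assumed
    given; in practice they come from the model's Jacobian).
    Perturbs the one feature that most increases the target class while
    decreasing the others.
    """
    saliency = np.where((grad_target > 0) & (grad_others < 0),
                        grad_target * np.abs(grad_others), 0.0)
    i = int(np.argmax(saliency))       # most influential feature
    x = x.copy()
    x[i] = np.clip(x[i] + theta, 0.0, 1.0)
    return x, i
```

Repeating this step until the model predicts the target class yields perturbations confined to a small number of features.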

Certified Adversarial Robustness via Randomized Smoothing

locuslab/smoothing 8 Feb 2019

We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm.
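The smoothed classifier's prediction at `x` is the class the base classifier assigns most often under Gaussian noise. A minimal Monte-Carlo sketch of that prediction step (the paper's certification procedure additionally bounds the class probabilities; `base_classifier` is a hypothetical stand-in for a trained network):

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n=1000, seed=0):
    """Predict with the smoothed classifier g(x) = argmax_c P(f(x + noise) = c).

    base_classifier maps a batch of inputs to integer class labels.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n,) + x.shape)
    labels = base_classifier(x[None, :] + noise)  # classify n noisy copies
    counts = np.bincount(labels)
    return int(np.argmax(counts))                 # majority-vote class
```

The certified L2 radius then grows with `sigma` and with the margin between the top two noisy class frequencies.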

Theoretically Principled Trade-off between Robustness and Accuracy

yaodongyu/TRADES 24 Jan 2019

We identify a trade-off between robustness and accuracy that serves as a guiding principle in the design of defenses against adversarial examples.
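The resulting TRADES objective adds to the natural cross-entropy a robustness regularizer, the KL divergence between predictions on clean and adversarial inputs, weighted by a trade-off coefficient. A single-example numpy sketch (logits are assumed given; in training, the adversarial logits come from an inner maximization of the KL term):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def trades_loss(logits_nat, logits_adv, y, beta=6.0):
    """TRADES objective: natural cross-entropy plus beta * KL(f(x) || f(x')).

    beta controls the robustness/accuracy trade-off identified in the paper.
    """
    p_nat = softmax(logits_nat)
    p_adv = softmax(logits_adv)
    ce = -np.log(p_nat[y] + 1e-12)                                   # accuracy term
    kl = np.sum(p_nat * (np.log(p_nat + 1e-12) - np.log(p_adv + 1e-12)))
    return ce + beta * kl                                            # robustness term
```

Setting `beta` to zero recovers standard training; larger values trade natural accuracy for robustness.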

Adversarial Training for Free!

mahyarnajibi/FreeAdversarialTraining NeurIPS 2019

Adversarial training, in which a network is trained on adversarial examples, is one of the few defenses against adversarial attacks that withstands strong attacks.
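This paper removes most of that cost by replaying each minibatch several times and reusing a single backward pass: the gradient w.r.t. the parameters updates the model while the gradient w.r.t. the input updates the perturbation "for free". A toy sketch on a linear model with squared loss (hypothetical stand-in for a network; the real method operates on minibatches with SGD):

```python
import numpy as np

def free_adv_train_step(w, x, y, eps=0.3, lr=0.1, replays=4, delta=None):
    """One 'free' adversarial training step on f(x) = w @ x with squared loss.

    Each replay reuses the same residual: grad_w updates the parameters,
    grad_x updates the perturbation delta, at no extra backward cost.
    """
    if delta is None:
        delta = np.zeros_like(x)
    for _ in range(replays):
        x_adv = x + delta
        err = w @ x_adv - y                  # shared residual of squared loss
        grad_w = err * x_adv                 # dLoss/dw
        grad_x = err * w                     # dLoss/dx, from the same pass
        w = w - lr * grad_w                  # descend on parameters
        delta = np.clip(delta + eps * np.sign(grad_x), -eps, eps)  # ascend on input
    return w, delta
```

The perturbation `delta` is carried over between minibatches, so the attack strengthens over training without extra gradient computations.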

ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models

huanzhang12/ZOO-Attack 14 Aug 2017

Rather than leveraging attack transferability from substitute models, the authors propose zeroth-order optimization (ZOO) based attacks that directly estimate the gradients of the targeted DNN to generate adversarial examples.
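The core idea is that a black-box model's gradient can be approximated coordinate-by-coordinate with symmetric finite differences, using only loss queries. A minimal sketch (ZOO adds coordinate sampling and other tricks to keep the query count practical; `f` here is a hypothetical black-box scalar loss):

```python
import numpy as np

def zoo_gradient(f, x, h=1e-4):
    """Zeroth-order (symmetric-difference) gradient estimate of a
    black-box scalar loss f, as in ZOO's coordinate-wise estimation.
    Costs two queries per coordinate.
    """
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = h
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g
```

The estimated gradient then drives a standard optimization-based attack without any access to the model's internals.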

AOGNets: Compositional Grammatical Architectures for Deep Learning


This paper presents deep compositional grammatical architectures which harness the best of two worlds: grammar models and DNNs.

Certified Defenses against Adversarial Examples

worksheets/0xa21e7940 ICLR 2018

While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs.