Adversarial Attack

272 papers with code • 2 benchmarks • 6 datasets

An Adversarial Attack is a technique for finding a perturbation that changes the prediction of a machine learning model. The perturbation can be very small and imperceptible to the human eye.

Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
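To make the definition concrete, here is a minimal sketch of the classic fast gradient sign method (FGSM), one of the simplest such attacks, written against PyTorch; the model, labels, and budget eps are placeholders:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """One-step attack: move each pixel by eps in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # sign(grad) is the L-inf perturbation of size eps that (locally) hurts the model most.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```

Even this one-step perturbation is often enough to flip the prediction while staying visually indistinguishable from the original image.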

Greatest papers with code

Towards Deep Learning Models Resistant to Adversarial Attacks

openai/cleverhans ICLR 2018

Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.

Tasks: Adversarial Attack • Adversarial Defense • +2 more
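The attack studied in this paper is projected gradient descent (PGD) on the training loss, and the defense is training on the examples it produces. A minimal L-inf PGD sketch (step size, budget, and iteration count are illustrative):

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iterated FGSM steps, each projected back into the L-inf eps-ball around x."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()             # ascend the loss
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)   # project into the ball
    return x_adv.detach()
```

Adversarial training in the paper's sense then minimizes the loss on these PGD outputs rather than on clean inputs.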

Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

tensorflow/cleverhans 3 Oct 2016

An adversarial example library for constructing attacks, building defenses, and benchmarking both.

Tasks: Adversarial Attack • Adversarial Defense
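The library's interface has changed substantially across major versions; the sketch below targets the current PyTorch API rather than the TensorFlow-based v2.1.0 described in the report, and the model and batch are stand-ins:

```python
import torch
from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method
from cleverhans.torch.attacks.projected_gradient_descent import projected_gradient_descent

# Stand-in classifier and batch; substitute a real model and data.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)

# Benchmark the same model against two of the library's attacks.
x_fgm = fast_gradient_method(model, x, eps=0.3, norm=float("inf"))
x_pgd = projected_gradient_descent(model, x, eps=0.3, eps_iter=0.01,
                                   nb_iter=40, norm=float("inf"))
```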

The Limitations of Deep Learning in Adversarial Settings

cleverhans-lab/cleverhans 24 Nov 2015

In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.

Tasks: Adversarial Attack • Adversarial Defense
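The algorithms introduced here (the Jacobian-based saliency map attack, JSMA) rank input features by a saliency map computed from the forward derivative, i.e. the Jacobian of the model's outputs with respect to its inputs. A simplified sketch of that map for a targeted attack, using logits in place of the paper's softmax outputs:

```python
import torch

def jacobian_saliency(model, x, target):
    """Score each feature of a single input x by how much increasing it
    helps the target class while hurting all other classes."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)[0]
    jac = torch.stack([torch.autograd.grad(logits[c], x, retain_graph=True)[0][0]
                       for c in range(logits.numel())])  # (classes, *input_shape)
    d_target = jac[target]
    d_others = jac.sum(0) - d_target
    # Zero out features that cannot help: target gradient negative,
    # or the summed gradient of the other classes positive.
    mask = (d_target > 0) & (d_others < 0)
    return torch.where(mask, d_target * d_others.abs(), torch.zeros_like(d_target))
```

The full algorithm repeatedly perturbs the highest-scoring feature pair and recomputes the map until the target class is reached or a distortion budget is exhausted.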

Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

jacobgil/pytorch-grad-cam 3 Oct 2019

Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks and the reasons why they make specific decisions.

Tasks: Adversarial Attack • Decision Making • +1 more
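A usage sketch against the repository's documented interface; argument names have changed between releases, so treat the exact signature as an assumption, and load real weights before trusting the heatmap:

```python
import torch
from torchvision.models import resnet50
from pytorch_grad_cam import ScoreCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

model = resnet50(weights=None).eval()       # use pretrained weights in practice
input_tensor = torch.rand(1, 3, 224, 224)   # stand-in for a preprocessed image

# Score-CAM weights each activation map by the class-score change it causes
# when used to mask the input, so unlike Grad-CAM it needs no gradients.
cam = ScoreCAM(model=model, target_layers=[model.layer4[-1]])
heatmap = cam(input_tensor=input_tensor, targets=[ClassifierOutputTarget(281)])[0]
```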

Adversarial Examples on Graph Data: Deep Insights into Attack and Defense

stellargraph/stellargraph 5 Mar 2019

Based on this observation, we propose a defense approach which inspects the graph and recovers the potential adversarial perturbations.

Tasks: Adversarial Attack • Adversarial Defense
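The abstract does not spell out the inspection step; as we understand the paper, adversarially inserted edges tend to connect nodes with very dissimilar features, so such edges are treated as suspect and pruned. A sketch of a Jaccard-similarity edge filter in that spirit (the threshold and the dense binary-feature representation are assumptions):

```python
import numpy as np

def prune_dissimilar_edges(adj, features, threshold=0.01):
    """Drop edges whose endpoints share almost no binary features."""
    adj = adj.copy()
    rows, cols = np.nonzero(np.triu(adj, k=1))
    for i, j in zip(rows, cols):
        inter = np.minimum(features[i], features[j]).sum()
        union = np.maximum(features[i], features[j]).sum()
        jaccard = inter / union if union > 0 else 0.0
        if jaccard < threshold:
            adj[i, j] = adj[j, i] = 0  # prune the suspicious edge
    return adj
```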

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

bethgelab/foolbox 25 Feb 2021

Evaluating adversarial robustness amounts to finding the minimum perturbation needed to have an input sample misclassified.

Tasks: Adversarial Attack
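Foolbox ships implementations of these fast minimum-norm (FMN) attacks; a sketch assuming the LInfFMNAttack class name from recent releases, with a toy model standing in for a real classifier:

```python
import torch
import foolbox as fb

model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10)).eval()
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
images = torch.rand(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))

# FMN searches for the smallest misclassifying perturbation directly, so
# robustness is reported as a perturbation size, not accuracy at a fixed eps.
attack = fb.attacks.LInfFMNAttack()
raw, clipped, is_adv = attack(fmodel, images, labels, epsilons=None)
norms = (clipped - images).flatten(1).norm(float("inf"), dim=1)
print("median minimal L-inf perturbation:", norms.median().item())
```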

Foolbox: A Python toolbox to benchmark the robustness of machine learning models

bethgelab/foolbox 13 Jul 2017

Foolbox is a new Python package to generate such adversarial perturbations and to quantify and compare the robustness of machine learning models.

Tasks: Adversarial Attack
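The core benchmarking loop is close to the README's example: wrap the model, run an attack at several budgets, and read off robust accuracy per epsilon. Model and data are stand-ins here:

```python
import torch
import foolbox as fb

model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10)).eval()
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
images = torch.rand(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))

attack = fb.attacks.LinfPGD()
epsilons = [0.0, 0.01, 0.03, 0.1]
raw, clipped, success = attack(fmodel, images, labels, epsilons=epsilons)

# Robust accuracy at each budget: the fraction of samples the attack failed on.
robust_accuracy = 1 - success.float().mean(dim=-1)
for eps, acc in zip(epsilons, robust_accuracy):
    print(f"eps={eps}: robust accuracy {acc.item():.2f}")
```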

TextAttack: A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP

QData/TextAttack EMNLP 2020

TextAttack also includes data augmentation and adversarial training modules for using components of adversarial attacks to improve model accuracy and robustness.

Tasks: Adversarial Attack • Adversarial Text • +4 more
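A sketch of the Python API as documented, using one of the fine-tuned victim models the authors publish on the Hugging Face hub (the model name and recipe are taken from the project's examples):

```python
import transformers
from textattack import Attacker
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# A fine-tuned victim model published by the TextAttack authors.
name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Any packaged attack recipe can be swapped in here.
attack = TextFoolerJin2019.build(wrapper)
dataset = HuggingFaceDataset("imdb", split="test")
Attacker(attack, dataset).attack_dataset()
```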

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

QData/TextAttack EMNLP 2020

Adversarial attacks on discrete data (such as text) have proven significantly more challenging than attacks on continuous data (such as images), since gradient-based methods cannot be applied directly to generate adversarial samples.

Tasks: Adversarial Attack
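The attack is packaged as a TextAttack recipe, so under the sketch above it can be swapped in with one line (assuming the BERTAttackLi2020 recipe name):

```python
from textattack.attack_recipes import BERTAttackLi2020

# `wrapper` is the HuggingFaceModelWrapper built in the TextAttack sketch above.
# Here a masked language model proposes the word substitutions, instead of the
# nearest-neighbor word embeddings TextFooler uses.
attack = BERTAttackLi2020.build(wrapper)
```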

BAE: BERT-based Adversarial Examples for Text Classification

QData/TextAttack EMNLP 2020

Modern text classification models are susceptible to adversarial examples: perturbed versions of the original text that are indiscernible to humans yet misclassified by the model.

Tasks: Adversarial Attack • Adversarial Text • +2 more