Adversarial Attack Detection

7 papers with code • 0 benchmarks • 0 datasets

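Adversarial attack detection is the task of identifying inputs that have been deliberately perturbed to make a machine learning model produce incorrect predictions.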

Most implemented papers

Maximum Mean Discrepancy Test is Aware of Adversarial Attacks

Sjtubrian/SAMMD 22 Oct 2020

Earlier work reported that the MMD test is unaware of adversarial attacks: the test failed to detect the discrepancy between natural and adversarial data. As its title indicates, this paper disputes that conclusion.
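
For readers unfamiliar with the test, the following is a minimal sketch of a generic kernel two-sample test based on MMD, with a permutation test for the p-value. It is not the method from the listed paper or its repository: the fixed Gaussian kernel, the `bandwidth` value, and all function names are assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(ps, qs):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in ps for q in qs)
        return total / (len(ps) * len(qs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def permutation_p_value(xs, ys, num_permutations=200, bandwidth=1.0, seed=0):
    """Estimate a p-value by reshuffling the pooled samples (hypothetical helper)."""
    rng = random.Random(seed)
    observed = mmd_squared(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    at_least_as_large = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        stat = mmd_squared(pooled[:len(xs)], pooled[len(xs):], bandwidth)
        if stat >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_large + 1) / (num_permutations + 1)
```

In a detection setting, `xs` would hold features of natural data and `ys` features of the data under test; a small p-value suggests the two samples come from different distributions.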

Gotta Catch 'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks

Shawn-Shan/trapdoor 18 Apr 2019

Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space.
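
One way to act on this similarity can be sketched as follows, assuming a feature vector is available for each input and that a reference trapdoor feature vector has been recorded in advance. The cosine-similarity measure, the `threshold` value, and the function names are illustrative assumptions, not taken from the listed repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def looks_like_trapdoor(features, trapdoor_signature, threshold=0.9):
    """Flag an input whose features align closely with the trapdoor signature.

    `threshold` is a hypothetical value; in practice it would be tuned on
    benign data to hit a target false-positive rate.
    """
    return cosine_similarity(features, trapdoor_signature) >= threshold
```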

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

KaosEngineer/PriorNetworks NeurIPS 2019

Taking advantage of the new reverse KL-divergence training criterion, this paper investigates using Prior Networks to detect adversarial attacks and proposes a generalized form of adversarial training.

MetaAdvDet: Towards Robust Detection of Evolving Adversarial Attacks

machanic/MetaAdvDet 6 Aug 2019

To solve this few-shot problem of evolving attacks, we propose a meta-learning-based robust detection method that detects new adversarial attacks from limited examples.

Towards Feature Space Adversarial Attack

qiulingxu/FeatureSpaceAttack 26 Apr 2020

We propose a new adversarial attack on Deep Neural Networks for image classification.

Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

adverml/spectraldef_framework AAAI Workshop AdvML 2022

In its most commonly reported sub-task, RobustBench evaluates and ranks the adversarial robustness of trained neural networks on CIFAR-10 under AutoAttack (Croce and Hein 2020b), with L-infinity perturbations limited to eps = 8/255.
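
To make the perturbation budget concrete, here is a minimal sketch of what an L-infinity bound of eps = 8/255 means for a single image. It assumes pixel values scaled to [0, 1] and flattened into a list; the function names are hypothetical and this is not code from RobustBench or AutoAttack.

```python
EPS = 8 / 255  # budget from the excerpt above, assuming pixels in [0, 1]


def linf_distance(x, x_adv):
    """Largest absolute per-pixel difference between two flattened images."""
    return max(abs(a - b) for a, b in zip(x, x_adv))


def project_linf(x, x_adv, eps=EPS):
    """Clip x_adv so each pixel stays within eps of x and inside [0, 1]."""
    projected = []
    for a, b in zip(x, x_adv):
        b = min(max(b, a - eps), a + eps)  # stay inside the eps-ball around x
        projected.append(min(max(b, 0.0), 1.0))  # stay a valid pixel value
    return projected
```

With this scaling, no pixel of an admissible adversarial example differs from the original by more than 8 intensity levels out of 255.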

Residue-Based Natural Language Adversarial Attack Detection

rainavyas/naacl-2022-residue-detector 17 Apr 2022

Many popular image adversarial detection approaches are able to identify adversarial examples from embedding feature spaces, whilst in the NLP domain existing state-of-the-art detection approaches focus solely on input text features, without considering model embedding spaces.