Adversarial Robustness
606 papers with code • 7 benchmarks • 9 datasets
Adversarial Robustness evaluates the vulnerabilities of machine learning models under various types of adversarial attacks.
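To make the task concrete, here is a minimal sketch of a classic adversarial attack, the Fast Gradient Sign Method (FGSM), run against a toy linear classifier. The model (`w`, `b`) and all names are illustrative assumptions, not taken from any of the papers listed below.

```python
import numpy as np

# Toy linear binary classifier: predict class 1 if w . x + b > 0.
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def predict(x):
    return int(w @ x + b > 0)

def fgsm(x, y, eps):
    """FGSM for this linear model: step eps in the sign of the
    loss gradient w.r.t. x. For a linear score that gradient is
    +/- w, pointing in the direction that lowers class y's score."""
    grad = w if y == 0 else -w
    return x + eps * np.sign(grad)

x = np.array([0.2, -0.4, 0.3])   # clean input, classified as 1
x_adv = fgsm(x, y=1, eps=0.6)    # L_inf perturbation of size 0.6
# x_adv now crosses the decision boundary and is classified as 0
```

The attack stays inside an L_inf ball of radius `eps`; robustness benchmarks typically report accuracy under such bounded perturbations.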
Libraries
Use these libraries to find Adversarial Robustness models and implementations.
Most implemented papers
Testing Robustness Against Unforeseen Adversaries
To narrow this discrepancy between research and reality, we introduce ImageNet-UA, a framework for evaluating model robustness against a range of unforeseen adversaries, including eighteen new non-$L_p$ attacks.
Adversarial Weight Perturbation Helps Robust Generalization
The study of improving the robustness of deep neural networks against adversarial examples has grown rapidly in recent years.
Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness.
Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints
Evaluating adversarial robustness amounts to finding the minimum perturbation needed to have an input sample misclassified.
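The minimum-perturbation view above has a closed form for a linear classifier: the smallest $L_2$ perturbation that flips the prediction is the projection of the input onto the decision hyperplane, at distance $|w \cdot x + b| / \lVert w \rVert$. The sketch below illustrates this on a toy model; the classifier and all names are hypothetical, not the paper's method.

```python
import numpy as np

# Toy linear classifier: predict class 1 if w . x + b > 0.
w = np.array([3.0, -4.0])   # ||w|| = 5
b = 1.0

def predict(x):
    return int(w @ x + b > 0)

def min_l2_perturbation(x):
    """Smallest L2 perturbation that reaches the boundary of a
    linear classifier: project x onto the hyperplane w . x + b = 0.
    Its norm is |w . x + b| / ||w||, the geometric margin."""
    margin = w @ x + b
    return -(margin / (w @ w)) * w

x = np.array([2.0, 1.0])            # score = 3.0, classified as 1
delta = min_l2_perturbation(x)      # ||delta|| = 3/5 = 0.6
x_adv = x + 1.001 * delta           # nudge just past the boundary
```

For nonlinear networks no closed form exists, which is why minimum-norm attacks search for this perturbation iteratively.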
FedNest: Federated Bilevel, Minimax, and Compositional Optimization
Standard federated optimization methods successfully apply to stochastic problems with single-level structure.
An Embarrassingly Simple Backdoor Attack on Self-supervised Learning
As a new paradigm in machine learning, self-supervised learning (SSL) is capable of learning high-quality representations of complex data without relying on labels.
Provably Bounding Neural Network Preimages
Most work on the formal verification of neural networks has focused on bounding the set of outputs that correspond to a given set of inputs (for example, bounded perturbations of a nominal input).
A Closer Look at Memorization in Deep Networks
We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness.
Exploring the Landscape of Spatial Robustness
The study of adversarial robustness has so far largely focused on perturbations bounded in $p$-norms.
Evading classifiers in discrete domains with provable optimality guarantees
We introduce a graphical framework that (1) generalizes existing attacks in discrete domains, (2) can accommodate complex cost functions beyond $p$-norms, including financial cost incurred when attacking a classifier, and (3) efficiently produces valid adversarial examples with guarantees of minimal adversarial cost.