Search Results for author: David Wagner

Found 30 papers, 24 papers with code

Generative AI Security: Challenges and Countermeasures

no code implementations 20 Feb 2024 Banghua Zhu, Norman Mu, Jiantao Jiao, David Wagner

Generative AI's expanding footprint across numerous industries has led to both excitement and increased scrutiny.

PAL: Proxy-Guided Black-Box Attack on Large Language Models

1 code implementation 15 Feb 2024 Chawin Sitawarin, Norman Mu, David Wagner, Alexandre Araujo

In this work, we introduce the Proxy-Guided Attack on LLMs (PAL), the first optimization-based attack on LLMs in a black-box query-only setting.

Jatmo: Prompt Injection Defense by Task-Specific Finetuning

1 code implementation 29 Dec 2023 Julien Piet, Maha Alrashed, Chawin Sitawarin, Sizhe Chen, Zeming Wei, Elizabeth Sun, Basel Alomair, David Wagner

Jatmo only needs a task prompt and a dataset of inputs for the task: it uses the teacher model to generate outputs.

Instruction Following
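Jatmo's core recipe as described above — pairing each task input with a teacher-generated output, so a student can later be fine-tuned on those pairs alone — can be sketched roughly as follows. This is an illustration only: `teacher` and `build_finetuning_dataset` are hypothetical names standing in for whatever model API and pipeline one actually uses, not part of Jatmo itself.

```python
def build_finetuning_dataset(task_prompt, inputs, teacher):
    """Pair each task input with a teacher-generated output.

    `teacher` is a hypothetical callable standing in for any
    instruction-tuned model API.
    """
    dataset = []
    for x in inputs:
        # The teacher sees the trusted task prompt plus one input...
        output = teacher(f"{task_prompt}\n\nInput: {x}")
        # ...and the student is later fine-tuned on (input -> output)
        # pairs alone, so it need not follow embedded instructions.
        dataset.append({"input": x, "output": output})
    return dataset

# Toy usage with a dummy teacher that just upper-cases its prompt:
dummy_teacher = lambda prompt: prompt.upper()
data = build_finetuning_dataset("Summarize the text.", ["hello"], dummy_teacher)
```

The point of the construction is that the fine-tuned student only ever sees raw task inputs, which is what makes prompt injection harder.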

Mark My Words: Analyzing and Evaluating Language Model Watermarks

1 code implementation 1 Dec 2023 Julien Piet, Chawin Sitawarin, Vivian Fang, Norman Mu, David Wagner

The capabilities of large language models have grown significantly in recent years and so too have concerns about their misuse.

Language Modelling

Can LLMs Follow Simple Rules?

1 code implementation 6 Nov 2023 Norman Mu, Sarah Chen, Zifan Wang, Sizhe Chen, David Karamardian, Lulwa Aljeraisy, Dan Hendrycks, David Wagner

As Large Language Models (LLMs) are deployed with increasing real-world responsibilities, it is important to be able to specify and constrain the behavior of these systems in a reliable manner.

Defending Against Transfer Attacks From Public Models

1 code implementation 26 Oct 2023 Chawin Sitawarin, Jaewon Chang, David Huang, Wesson Altoyan, David Wagner

We evaluate the transfer attacks in this setting and propose a specialized defense method based on a game-theoretic perspective.

DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection

1 code implementation 1 Apr 2023 Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, David Wagner

Combining our new dataset with previous datasets, we present an analysis of the challenges and promising research directions of using deep learning for detecting software vulnerabilities.

Feature Engineering Vulnerability Detection

Continuous Learning for Android Malware Detection

2 code implementations 8 Feb 2023 Yizheng Chen, Zhoujie Ding, David Wagner

We propose a new hierarchical contrastive learning scheme and a new sample selection technique to continuously train the Android malware classifier.

Active Learning Android Malware Detection +2

REAP: A Large-Scale Realistic Adversarial Patch Benchmark

1 code implementation ICCV 2023 Nabeel Hingun, Chawin Sitawarin, Jerry Li, David Wagner

In this work, we propose the REAP (REalistic Adversarial Patch) benchmark, a digital benchmark that allows the user to evaluate patch attacks on real images and under real-world conditions.

Part-Based Models Improve Adversarial Robustness

1 code implementation 15 Sep 2022 Chawin Sitawarin, Kornrapat Pongmala, Yizheng Chen, Nicholas Carlini, David Wagner

We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks by introducing a part-based model for object classification.

Adversarial Robustness

SLIP: Self-supervision meets Language-Image Pre-training

1 code implementation 23 Dec 2021 Norman Mu, Alexander Kirillov, David Wagner, Saining Xie

Across ImageNet and a battery of additional datasets, we find that SLIP improves accuracy by a large margin.

Multi-Task Learning Representation Learning +1

Learning Security Classifiers with Verified Global Robustness Properties

1 code implementation 24 May 2021 Yizheng Chen, Shiqi Wang, Yue Qin, Xiaojing Liao, Suman Jana, David Wagner

Since data distribution shift is very common in security applications, e.g. malware detection, local robustness cannot guarantee that the property holds for unseen inputs at deployment time.

Malware Detection

Adversarial Examples for k-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams

no code implementations NeurIPS 2021 Chawin Sitawarin, Evgenios M. Kornaropoulos, Dawn Song, David Wagner

At a high level, the search radius expands to the nearby higher-order Voronoi cells until we find a cell that classifies differently from the input point.

Adversarial Robustness
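The search described above can be illustrated with a much-simplified sketch: instead of the paper's higher-order Voronoi machinery, this toy 1-D version grows a radius around the input until a candidate point on the boundary is classified differently by a plain 1-NN classifier. All names here are illustrative, not from the paper's code.

```python
def nearest_label(points, labels, x):
    """1-NN prediction over a list of 1-D training points."""
    i = min(range(len(points)), key=lambda j: abs(points[j] - x))
    return labels[i]

def radius_expansion_attack(points, labels, x, step=0.1, max_radius=10.0):
    """Grow a search radius around x until some candidate on its
    boundary is classified differently from x, then return it.

    A much-simplified 1-D stand-in for the paper's search over
    higher-order Voronoi cells.
    """
    original = nearest_label(points, labels, x)
    r = step
    while r <= max_radius:
        for candidate in (x - r, x + r):  # boundary of the current radius
            if nearest_label(points, labels, candidate) != original:
                return candidate  # differently-classified point found
        r += step
    return None  # no label change within max_radius

# Toy usage: two training points with different labels
train = [0.0, 1.0]
train_labels = ["a", "b"]
adv = radius_expansion_attack(train, train_labels, 0.0)
```

Because the radius grows in small steps, the first candidate found is (up to the step size) a minimum-norm adversarial example for the 1-NN classifier.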

Model-Agnostic Defense for Lane Detection against Adversarial Attack

1 code implementation 1 Mar 2021 Henry Xu, An Ju, David Wagner

Susceptibility of neural networks to adversarial attack prompts serious safety concerns for lane detection efforts, a domain where such models have been widely applied.

Adversarial Attack Autonomous Driving +1

Minority Reports Defense: Defending Against Adversarial Patches

no code implementations 28 Apr 2020 Michael McCoyd, Won Park, Steven Chen, Neil Shah, Ryan Roggenkemper, Minjune Hwang, Jason Xinyu Liu, David Wagner

We propose a defense against patch attacks based on partially occluding the image around each candidate patch location, so that a few occlusions each completely hide the patch.

Adversarial Attack General Classification +1
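The occlusion idea above can be sketched as a simple voting scheme: classify many copies of the image, each with a different region blanked out, and treat the input as suspicious if any occlusion position changes the prediction. This is a minimal sketch under toy assumptions — `classify` is a hypothetical stand-in for any image classifier, and the image is a 2-D list rather than a real tensor.

```python
def minority_reports_predict(image, classify, occluder_size):
    """Occlusion-voting sketch of the Minority Reports idea.

    Slide a square occluder over every position; if every occluded
    copy gets the same prediction, accept it, otherwise return None
    (a dissenting "minority report" suggests a hidden patch).
    """
    h, w = len(image), len(image[0])
    votes = []
    for top in range(h - occluder_size + 1):
        for left in range(w - occluder_size + 1):
            occluded = [row[:] for row in image]  # deep-enough copy
            for r in range(top, top + occluder_size):
                for c in range(left, left + occluder_size):
                    occluded[r][c] = 0  # blank out this region
            votes.append(classify(occluded))
    # Unanimous agreement -> accept; any disagreement -> reject.
    return votes[0] if len(set(votes)) == 1 else None

# Toy usage: a 2x2 "image" and two dummy classifiers
img = [[1, 1], [1, 1]]
flat = lambda im: "cat"                                # ignores occlusion
spiky = lambda im: "cat" if im[0][0] == 1 else "dog"   # depends on one pixel
```

In the paper's setting the occluder is sized so that some occlusion position always hides an admissible patch completely, which is what makes the disagreement signal meaningful.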

SAT: Improving Adversarial Training via Curriculum-Based Loss Smoothing

no code implementations 18 Mar 2020 Chawin Sitawarin, Supriyo Chakraborty, David Wagner

This leads to a significant improvement in both clean accuracy and robustness compared to AT, TRADES, and other baselines.

Adversarial Robustness

Minimum-Norm Adversarial Examples on KNN and KNN-Based Models

1 code implementation 14 Mar 2020 Chawin Sitawarin, David Wagner

We study the robustness against adversarial examples of kNN classifiers and classifiers that combine kNN with neural networks.

Stateful Detection of Black-Box Adversarial Attacks

1 code implementation 12 Jul 2019 Steven Chen, Nicholas Carlini, David Wagner

This is true even when, as is the case in many practical settings, the classifier is hosted as a remote service and so the adversary does not have direct access to the model parameters.
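The stateful idea — remembering past queries and flagging a new one that is suspiciously similar to many of them — can be sketched as follows. This is a toy stand-in, not the paper's system: it uses plain Euclidean distance over raw query vectors, and all names are illustrative.

```python
from collections import deque

def make_stateful_detector(threshold, k=3, history_size=1000):
    """Sketch of stateful black-box attack detection.

    Keep a bounded history of queries and flag a new one whose mean
    distance to its k nearest stored queries falls below `threshold`
    (i.e., it looks like fine-grained probing around one input).
    """
    history = deque(maxlen=history_size)

    def check(query):
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(query, past)) ** 0.5
            for past in history
        )
        suspicious = len(dists) >= k and sum(dists[:k]) / k < threshold
        history.append(query)
        return suspicious

    return check

# Toy usage: flag the fourth near-identical query
detector = make_stateful_detector(threshold=0.5, k=3)
```

Bounding the history with `deque(maxlen=...)` keeps memory constant while still catching attacks whose queries cluster tightly in a short window.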

Defending Against Adversarial Examples with K-Nearest Neighbor

1 code implementation 23 Jun 2019 Chawin Sitawarin, David Wagner

With our models, the mean perturbation norm required to fool our MNIST model is 3.07, and 2.30 for CIFAR-10.

On the Robustness of Deep K-Nearest Neighbors

2 code implementations 20 Mar 2019 Chawin Sitawarin, David Wagner

Despite a large amount of attention on adversarial examples, very few works have demonstrated an effective defense against this threat.

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

4 code implementations ICML 2018 Anish Athalye, Nicholas Carlini, David Wagner

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples.

Adversarial Attack Adversarial Defense

MagNet and "Efficient Defenses Against Adversarial Attacks" are Not Robust to Adversarial Examples

1 code implementation 22 Nov 2017 Nicholas Carlini, David Wagner

MagNet and "Efficient Defenses..." were recently proposed as defenses against adversarial examples.

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

no code implementations 20 May 2017 Nicholas Carlini, David Wagner

Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly.

Towards Evaluating the Robustness of Neural Networks

26 code implementations 16 Aug 2016 Nicholas Carlini, David Wagner

Defensive distillation is a recently proposed approach that can take an arbitrary neural network and increase its robustness, reducing the success rate of current attacks at finding adversarial examples from $95\%$ to $0.5\%$.

Adversarial Attack Test

Spoofing 2D Face Detection: Machines See People Who Aren't There

no code implementations 6 Aug 2016 Michael McCoyd, David Wagner

Machine learning is increasingly used to make sense of the physical world yet may suffer from adversarial manipulation.

BIG-bench Machine Learning Face Detection

Defensive Distillation is Not Robust to Adversarial Examples

1 code implementation 14 Jul 2016 Nicholas Carlini, David Wagner

We show that defensive distillation is not secure: it is no more resistant to targeted misclassification attacks than unprotected neural networks.
