Search Results for author: David Evans

Found 37 papers, 31 papers with code

Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

2 code implementations • Network and Distributed System Security Symposium 2018 • Weilin Xu, David Evans, Yanjun Qi

Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by adversarial examples that are generated by adding small but purposeful distortions to natural examples.
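
As an illustration of the kind of small but purposeful distortion the abstract refers to, here is a generic FGSM-style sketch; it is not the paper's detection method, and it assumes a differentiable PyTorch classifier `model` and a suitable `loss_fn` (both hypothetical names).

```python
import torch

def fgsm_perturb(model, loss_fn, x, y, eps=0.03):
    """Minimal FGSM-style sketch: nudge x by eps in the direction that
    increases the classifier's loss, producing a candidate adversarial
    example. `model` and `loss_fn` are assumed, not taken from the paper."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()        # small, purposeful distortion
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixel values in [0, 1]
```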

Feature Squeezing Mitigates and Detects Carlini/Wagner Adversarial Examples

1 code implementation • 30 May 2017 • Weilin Xu, David Evans, Yanjun Qi

Feature squeezing is a recently-introduced framework for mitigating and detecting adversarial examples.
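
A minimal sketch of the squeeze-and-compare idea behind feature squeezing, assuming a generic `predict_proba` function that returns a probability vector; the full detector combines several squeezers and a calibrated threshold, which are omitted here.

```python
import numpy as np

def bit_depth_squeeze(x, bits=4):
    # One example squeezer: reduce color bit depth so that many slightly
    # different inputs collapse onto the same squeezed input.
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def flag_adversarial(predict_proba, x, threshold=1.0, bits=4):
    # Compare predictions on the original and squeezed inputs; a large
    # L1 gap between the two probability vectors suggests the input
    # may be adversarial.
    gap = np.abs(predict_proba(x) - predict_proba(bit_depth_squeeze(x, bits)))
    return gap.sum() > threshold
```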

Query-limited Black-box Attacks to Classifiers

1 code implementation • 23 Dec 2017 • Fnu Suya, Yuan Tian, David Evans, Paolo Papotti

Specifically, we consider the problem of attacking machine learning classifiers subject to a budget of feature modification cost while minimizing the number of queries, where each query returns only a class and confidence score.

Bayesian Optimization • BIG-bench Machine Learning

Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning

4 code implementations • arXiv 2018 • Hyrum S. Anderson, Anant Kharkar, Bobby Filar, David Evans, Phil Roth

We show in experiments that our method can attack a gradient-boosted machine learning model with evasion rates that are substantial and appear to be strongly dependent on the dataset.

Cryptography and Security

Cost-Sensitive Robustness against Adversarial Examples

1 code implementation • ICLR 2019 • Xiao Zhang, David Evans

Several recent works have developed methods for training classifiers that are certifiably robust against norm-bounded adversarial perturbations.

General Classification
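
For reference, a classifier $f$ is certifiably robust at an input $x$ against norm-bounded perturbations if its prediction cannot be changed anywhere inside an $\ell_p$ ball around $x$; this is the standard notion the abstract alludes to, not the paper's cost-sensitive variant:

$$
f(x') = f(x) \quad \text{for all } x' \text{ with } \|x' - x\|_p \le \epsilon .
$$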

Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization

1 code implementation • NeurIPS 2018 • Bargav Jayaraman, Lingxiao Wang, David Evans, Quanquan Gu

We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state-of-the-art for both methods in the distributed learning setting.

Privacy Preserving
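
The sketch below contrasts the two perturbation strategies named in the abstract, assuming hypothetical `train_fn` and `grad_fn` helpers that operate on NumPy weight vectors; the noise scale `sigma` would have to be calibrated to the sensitivity and the privacy budget, an accounting step this sketch omits.

```python
import numpy as np

def output_perturbation(train_fn, data, sigma, rng=None):
    # Train non-privately, then add noise once to the final weights.
    rng = rng or np.random.default_rng()
    w = train_fn(data)
    return w + rng.normal(0.0, sigma, size=w.shape)

def gradient_perturbation(grad_fn, w0, data, lr, sigma, steps, clip, rng=None):
    # Clip each gradient to bound its sensitivity, add noise, then update.
    rng = rng or np.random.default_rng()
    w = w0.copy()
    for _ in range(steps):
        g = grad_fn(w, data)
        g = g / max(1.0, np.linalg.norm(g) / clip)
        w = w - lr * (g + rng.normal(0.0, sigma, size=g.shape))
    return w
```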

Evaluating Differentially Private Machine Learning in Practice

1 code implementation • 24 Feb 2019 • Bargav Jayaraman, David Evans

Differential privacy is a strong notion of privacy that can be used to prove formal guarantees, in terms of a privacy budget, $\epsilon$, about how much information is leaked by a mechanism.

BIG-bench Machine Learning • Privacy Preserving
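
For context, the guarantee referred to here is the standard definition of $\epsilon$-differential privacy: for any two datasets $D$ and $D'$ differing in a single record, a mechanism $\mathcal{M}$ satisfies

$$
\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \cdot \Pr[\mathcal{M}(D') \in S] \quad \text{for every set of outputs } S .
$$

Smaller $\epsilon$ means less leakage; the relaxed $(\epsilon, \delta)$ variant allows an additional additive slack $\delta$.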

Empirically Measuring Concentration: Fundamental Limits on Intrinsic Robustness

1 code implementation • NeurIPS 2019 • Saeed Mahloujifar, Xiao Zhang, Mohammad Mahmoody, David Evans

Many recent works have shown that adversarial examples that fool classifiers can be found by minimally perturbing a normal input.

Image Classification

Hybrid Batch Attacks: Finding Black-box Adversarial Examples with Limited Queries

1 code implementation • 19 Aug 2019 • Fnu Suya, Jianfeng Chi, David Evans, Yuan Tian

In a black-box setting, the adversary only has API access to the target model and each query is expensive.

Cryptography and Security
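
A tiny harness illustrating the setting described above: the attacker only gets to call a prediction API and pays for each call. The class name and `predict_fn` are illustrative assumptions, not part of the paper's code.

```python
class BlackBoxModel:
    """Wraps API-only access to a target classifier and enforces a
    query budget, mirroring the 'each query is expensive' setting."""

    def __init__(self, predict_fn, budget):
        self.predict_fn = predict_fn
        self.budget = budget
        self.queries = 0

    def query(self, x):
        if self.queries >= self.budget:
            raise RuntimeError("query budget exhausted")
        self.queries += 1
        return self.predict_fn(x)  # the attacker sees only this output
```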

Efficient Privacy-Preserving Stochastic Nonconvex Optimization

no code implementations • 30 Oct 2019 • Lingxiao Wang, Bargav Jayaraman, David Evans, Quanquan Gu

While many solutions for privacy-preserving convex empirical risk minimization (ERM) have been developed, privacy-preserving nonconvex ERM remains a challenge.

Privacy Preserving
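
Empirical risk minimization, the problem referred to throughout, is simply

$$
\min_{w} \; \frac{1}{n} \sum_{i=1}^{n} \ell(w; x_i, y_i),
$$

which is convex when the loss $\ell$ is convex in $w$ (e.g., logistic regression) and nonconvex for models such as neural networks.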

Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization

1 code implementation • ICML 2020 • Sicheng Zhu, Xiao Zhang, David Evans

We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the worst-case input perturbation.

Adversarial Robustness

Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models

1 code implementation • 1 Mar 2020 • Xiao Zhang, Jinghui Chen, Quanquan Gu, David Evans

Starting with Gilmer et al. (2018), several works have demonstrated the inevitability of adversarial examples based on different assumptions about the underlying input probability space.

Adversarial Robustness

One Neuron to Fool Them All

1 code implementation • 20 Mar 2020 • Anshuman Suri, David Evans

Despite vast research in adversarial examples, the root causes of model susceptibility are not well understood.

Certifying Joint Adversarial Robustness for Model Ensembles

1 code implementation • 21 Apr 2020 • Mainuddin Ahmad Jonas, David Evans

Deep Neural Networks (DNNs) are often vulnerable to adversarial examples. Several proposed defenses deploy an ensemble of models with the hope that, although the individual models may be vulnerable, an adversary will not be able to find an adversarial example that succeeds against the ensemble.

Adversarial Robustness
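
A minimal sketch of the ensemble defense setting described above (majority vote over independently trained models), showing why an adversarial example must transfer across most members to succeed; this is not the paper's certification procedure, and the `models` list and `.predict` method returning a single class label are assumptions.

```python
from collections import Counter

def ensemble_predict(models, x):
    # Majority vote over the members' predicted labels: to fool the
    # ensemble, an adversarial example has to flip most members at once.
    votes = [m.predict(x) for m in models]
    return Counter(votes).most_common(1)[0][0]
```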

Revisiting Membership Inference Under Realistic Assumptions

1 code implementation • 21 May 2020 • Bargav Jayaraman, Lingxiao Wang, Katherine Knipmeyer, Quanquan Gu, David Evans

Since previous inference attacks fail in the imbalanced prior setting, we develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function, and show that an attack combining this with thresholds on the per-instance loss can achieve a high positive predictive value (PPV) even in settings where other attacks appear to be ineffective.

Inference Attack
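
A hedged sketch of the loss-threshold intuition and the PPV metric mentioned above, assuming per-example losses have already been computed; the paper's combined attack adds more machinery than this.

```python
import numpy as np

def loss_threshold_attack(per_example_loss, threshold):
    # Predict "member" when the loss is small, following the intuition
    # that training points sit near local minima of the loss.
    return np.asarray(per_example_loss) < threshold

def positive_predictive_value(predicted_member, is_member):
    # PPV = true positives / predicted positives, the metric that
    # matters most when non-members vastly outnumber members.
    predicted = np.asarray(predicted_member, dtype=bool)
    actual = np.asarray(is_member, dtype=bool)
    if predicted.sum() == 0:
        return float("nan")
    return (predicted & actual).sum() / predicted.sum()
```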

Pointwise Paraphrase Appraisal is Potentially Problematic

no code implementations • ACL 2020 • Hannah Chen, Yangfeng Ji, David Evans

The prevailing approach for training and evaluating paraphrase identification models is constructed as a binary classification problem: the model is given a pair of sentences, and is judged by how accurately it classifies pairs as either paraphrases or non-paraphrases.

Binary Classification • Paraphrase Identification

Model-Targeted Poisoning Attacks with Provable Convergence

1 code implementation • 30 Jun 2020 • Fnu Suya, Saeed Mahloujifar, Anshuman Suri, David Evans, Yuan Tian

Our attack is the first model-targeted poisoning attack that provides provable convergence for convex models, and in our experiments, it either exceeds or matches state-of-the-art attacks in terms of attack success rate and distance to the target model.

Improved Estimation of Concentration Under $\ell_p$-Norm Distance Metrics Using Half Spaces

1 code implementation • ICLR 2021 • Jack Prescott, Xiao Zhang, David Evans

Mahloujifar et al. presented an empirical way to measure the concentration of a data distribution using samples, and employed it to find lower bounds on intrinsic robustness for several benchmark datasets.

Stealthy Backdoors as Compression Artifacts

1 code implementation • 30 Apr 2021 • Yulong Tian, Fnu Suya, Fengyuan Xu, David Evans

In a backdoor attack on a machine learning model, an adversary produces a model that performs well on normal inputs but outputs targeted misclassifications on inputs containing a small trigger pattern.

Backdoor Attack • Model Compression +1
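
For concreteness, here is a generic trigger-planting sketch of the backdoor threat model described above; the paper's contribution, hiding the backdoor so it only activates after model compression, is not shown, and the array shapes and poisoning rate are illustrative assumptions.

```python
import numpy as np

def add_trigger(image, size=3, value=1.0):
    # Stamp a small square trigger pattern into the corner of an
    # (H, W, C) image with pixel values in [0, 1].
    image = image.copy()
    image[:size, :size, :] = value
    return image

def poison_dataset(images, labels, target_label, rate=0.05, seed=0):
    # Relabel a small fraction of trigger-stamped images to the target
    # class; a model trained on this data behaves normally on clean
    # inputs but misclassifies inputs carrying the trigger.
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label
    return images, labels
```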

Formalizing Distribution Inference Risks

2 code implementations • 7 Jun 2021 • Anshuman Suri, David Evans

Property inference attacks reveal statistical properties about a training set but are difficult to distinguish from the primary purpose of statistical machine learning, which is to produce models that capture statistical properties about a distribution.

Understanding Intrinsic Robustness Using Label Uncertainty

1 code implementation • ICLR 2022 • Xiao Zhang, David Evans

A fundamental question in adversarial machine learning is whether a robust classifier exists for a given task.

Adversarial Robustness • Classification +1

Formalizing and Estimating Distribution Inference Risks

2 code implementations • 13 Sep 2021 • Anshuman Suri, David Evans

Distribution inference attacks can pose serious risks when models are trained on private data, but are difficult to distinguish from the intrinsic purpose of statistical machine learning -- namely, to produce models that capture statistical properties about a distribution.

Inference Attack

Memorization in NLP Fine-tuning Methods

1 code implementation • 25 May 2022 • FatemehSadat Mireshghallah, Archit Uniyal, Tianhao Wang, David Evans, Taylor Berg-Kirkpatrick

Large language models are shown to present privacy risks through memorization of training data, and several recent works have studied such risks for the pre-training phase.

Memorization

Combing for Credentials: Active Pattern Extraction from Smart Reply

no code implementations • 14 Jul 2022 • Bargav Jayaraman, Esha Ghosh, Melissa Chase, Sambuddha Roy, Wei Dai, David Evans

We show experimentally that it is possible for an adversary to extract sensitive user information present in the training data, even in realistic settings where all interactions with the model must go through a front-end that limits the types of queries.

Language Modelling

Are Attribute Inference Attacks Just Imputation?

1 code implementation • 2 Sep 2022 • Bargav Jayaraman, David Evans

Our main conclusions are: (1) previous attribute inference methods do not reveal more about the training data from the model than can be inferred by an adversary without access to the trained model, but with the same knowledge of the underlying distribution as needed to train the attribute inference attack; (2) black-box attribute inference attacks rarely learn anything that cannot be learned without the model; but (3) white-box attacks, which we introduce and evaluate in the paper, can reliably identify some records with the sensitive attribute value that would not be predicted without having access to the model.

Attribute Imputation +1
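
The imputation baseline behind conclusion (1) can be as simple as the sketch below: predict the sensitive attribute from the other features using data from the same distribution, with no access to the attacked model at all. The scikit-learn estimator and variable names are illustrative choices, not the paper's setup.

```python
from sklearn.ensemble import RandomForestClassifier

def imputation_baseline(public_features, public_sensitive, target_features):
    # Model-free attribute inference: learn to predict the sensitive
    # attribute from the remaining attributes alone.
    imputer = RandomForestClassifier(n_estimators=100, random_state=0)
    imputer.fit(public_features, public_sensitive)
    return imputer.predict(target_features)
```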

Balanced Adversarial Training: Balancing Tradeoffs between Fickleness and Obstinacy in NLP Models

1 code implementation • 20 Oct 2022 • Hannah Chen, Yangfeng Ji, David Evans

Traditional (fickle) adversarial examples involve finding a small perturbation that does not change an input's true label but confuses the classifier into outputting a different prediction.

Contrastive Learning • Natural Language Inference +1

Dissecting Distribution Inference

2 code implementations • 15 Dec 2022 • Anshuman Suri, Yifu Lu, Yanjin Chen, David Evans

A distribution inference attack aims to infer statistical properties of data used to train machine learning models.

Inference Attack

TrojanPuzzle: Covertly Poisoning Code-Suggestion Models

1 code implementation • 6 Jan 2023 • Hojjat Aghakhani, Wei Dai, Andre Manoel, Xavier Fernandes, Anant Kharkar, Christopher Kruegel, Giovanni Vigna, David Evans, Ben Zorn, Robert Sim

Prior attacks explicitly inject the insecure code payload into the training data, making the poison data detectable by static analysis tools that can remove such malicious data from the training set.

Data Poisoning

Manipulating Transfer Learning for Property Inference

1 code implementation • CVPR 2023 • Yulong Tian, Fnu Suya, Anshuman Suri, Fengyuan Xu, David Evans

We demonstrate attacks in which an adversary can manipulate the upstream model to conduct highly effective and specific property inference attacks (AUC score > 0.9), without incurring significant performance loss on the main task.

Transfer Learning

SoK: Memorization in General-Purpose Large Language Models

no code implementations • 24 Oct 2023 • Valentin Hartmann, Anshuman Suri, Vincent Bindschaedler, David Evans, Shruti Tople, Robert West

A major part of this success is due to their huge training datasets and the unprecedented number of model parameters, which allow them to memorize large amounts of information contained in the training data.

Memorization • Question Answering

SoK: Pitfalls in Evaluating Black-Box Attacks

1 code implementation • 26 Oct 2023 • Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, David Evans

However, these works make different assumptions about the adversary's knowledge, and the current literature lacks a cohesive organization centered around the threat model.

Understanding Variation in Subpopulation Susceptibility to Poisoning Attacks

no code implementations • 20 Nov 2023 • Evan Rose, Fnu Suya, David Evans

Machine learning is susceptible to poisoning attacks, in which an attacker controls a small fraction of the training data and chooses that data with the goal of inducing some behavior in the trained model that is unintended by the model developer.

Addressing Both Statistical and Causal Gender Fairness in NLP Models

1 code implementation • 30 Mar 2024 • Hannah Chen, Yangfeng Ji, David Evans

Statistical fairness stipulates equivalent outcomes for every protected group, whereas causal fairness prescribes that a model makes the same prediction for an individual regardless of their protected characteristics.

counterfactual Data Augmentation +1
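
One common way to formalize the two notions contrasted above: statistical (demographic) parity requires equal positive-prediction rates across protected groups, while counterfactual fairness requires an individual's prediction to be unchanged under an intervention on the protected attribute. These are standard textbook formulations rather than the paper's exact criteria:

$$
\Pr[\hat{Y} = 1 \mid A = a] = \Pr[\hat{Y} = 1 \mid A = a'] \quad \text{for all groups } a, a',
$$

$$
\hat{Y}_{A \leftarrow a}(x) = \hat{Y}_{A \leftarrow a'}(x) \quad \text{for every individual } x .
$$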
