no code implementations • 2 Aug 2024 • Hannah Chen, Yangfeng Ji, David Evans
Large language models (LLMs) are now being considered and even deployed for applications that support high-stakes decision-making, such as recruitment and clinical decisions.
1 code implementation • 29 Jun 2024 • Bogdan Ruszczak, Krzysztof Kotowski, David Evans, Jakub Nalepa
Detecting anomalous events in satellite telemetry is a critical task in space operations.
2 code implementations • 17 Jun 2024 • Anshuman Suri, Xiao Zhang, David Evans
Membership inference attacks are used as a key tool for disclosure auditing.
1 code implementation • 30 Mar 2024 • Hannah Chen, Yangfeng Ji, David Evans
Statistical fairness stipulates equivalent outcomes for every protected group, whereas causal fairness prescribes that a model makes the same prediction for an individual regardless of their protected characteristics.
1 code implementation • 12 Feb 2024 • Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
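As a point of reference, a minimal loss-threshold baseline for membership inference can be sketched as follows; the model, threshold, and candidate points are hypothetical placeholders, not the setup evaluated in this paper:

```python
import torch
import torch.nn.functional as F

def loss_threshold_mia(model, x, y, threshold):
    """Flag (x, y) as a training member if its per-example loss falls below a calibrated threshold."""
    model.eval()
    with torch.no_grad():
        logits = model(x)                                    # forward pass on candidate points
        loss = F.cross_entropy(logits, y, reduction="none")  # per-example loss
    return loss < threshold                                  # True => predicted training member
```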
no code implementations • 20 Nov 2023 • Evan Rose, Fnu Suya, David Evans
Machine learning is susceptible to poisoning attacks, in which an attacker controls a small fraction of the training data and chooses that data with the goal of inducing behavior in the trained model that the model developer did not intend.
1 code implementation • 26 Oct 2023 • Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, David Evans
However, these works make different assumptions about the adversary's knowledge, and the current literature lacks a cohesive organization centered around the threat model.
no code implementations • 24 Oct 2023 • Valentin Hartmann, Anshuman Suri, Vincent Bindschaedler, David Evans, Shruti Tople, Robert West
A major part of this success is due to their huge training datasets and the unprecedented number of model parameters, which allow them to memorize large amounts of information contained in the training data.
1 code implementation • CVPR 2023 • Yulong Tian, Fnu Suya, Anshuman Suri, Fengyuan Xu, David Evans
We demonstrate attacks in which an adversary can manipulate the upstream model to conduct highly effective and specific property inference attacks (AUC score $> 0.9$), without incurring significant performance loss on the main task.
1 code implementation • 6 Jan 2023 • Hojjat Aghakhani, Wei Dai, Andre Manoel, Xavier Fernandes, Anant Kharkar, Christopher Kruegel, Giovanni Vigna, David Evans, Ben Zorn, Robert Sim
To achieve this, prior attacks explicitly inject the insecure code payload into the training data, making the poison data detectable by static analysis tools that can remove such malicious data from the training set.
no code implementations • 21 Dec 2022 • Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, Santiago Zanella-Béguelin
Deploying machine learning models in production may allow adversaries to infer sensitive information about training data.
2 code implementations • 15 Dec 2022 • Anshuman Suri, Yifu Lu, Yanjin Chen, David Evans
A distribution inference attack aims to infer statistical properties of data used to train machine learning models.
1 code implementation • 20 Oct 2022 • Hannah Chen, Yangfeng Ji, David Evans
Traditional (fickle) adversarial examples involve finding a small perturbation that does not change an input's true label but confuses the classifier into outputting a different prediction.
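A minimal sketch of how such a fickle example can be produced, assuming a differentiable PyTorch classifier and an illustrative perturbation budget (this is not the method proposed in this paper):

```python
import torch
import torch.nn.functional as F

def fickle_example(model, x, y, epsilon=0.03):
    """One-step (FGSM-style) perturbation intended to flip the prediction, not the true label."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()    # small step that increases the loss
```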
1 code implementation • 2 Sep 2022 • Bargav Jayaraman, David Evans
Our main conclusions are: (1) previous attribute inference methods do not reveal more about the training data from the model than can be inferred by an adversary without access to the trained model, but with the same knowledge of the underlying distribution as needed to train the attribute inference attack; (2) black-box attribute inference attacks rarely learn anything that cannot be learned without the model; but (3) white-box attacks, which we introduce and evaluate in the paper, can reliably identify some records with the sensitive attribute value that would not be predicted without having access to the model.
no code implementations • 14 Jul 2022 • Bargav Jayaraman, Esha Ghosh, Melissa Chase, Sambuddha Roy, Wei Dai, David Evans
We show experimentally that it is possible for an adversary to extract sensitive user information present in the training data, even in realistic settings where all interactions with the model must go through a front-end that limits the types of queries.
1 code implementation • 25 May 2022 • FatemehSadat Mireshghallah, Archit Uniyal, Tianhao Wang, David Evans, Taylor Berg-Kirkpatrick
Large language models have been shown to present privacy risks through memorization of training data, and several recent works have studied such risks for the pre-training phase.
2 code implementations • 13 Sep 2021 • Anshuman Suri, David Evans
Distribution inference attacks can pose serious risks when models are trained on private data, but are difficult to distinguish from the intrinsic purpose of statistical machine learning -- namely, to produce models that capture statistical properties about a distribution.
1 code implementation • ICLR 2022 • Xiao Zhang, David Evans
A fundamental question in adversarial machine learning is whether a robust classifier exists for a given task.
2 code implementations • 7 Jun 2021 • Anshuman Suri, David Evans
Property inference attacks reveal statistical properties about a training set but are difficult to distinguish from the primary purpose of statistical machine learning, which is to produce models that capture statistical properties about a distribution.
1 code implementation • 30 Apr 2021 • Yulong Tian, Fnu Suya, Fengyuan Xu, David Evans
In a backdoor attack on a machine learning model, an adversary produces a model that performs well on normal inputs but outputs targeted misclassifications on inputs containing a small trigger pattern.
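An illustrative sketch of the data-poisoning side of such an attack, with a hypothetical patch size, location, and target label (not the specific attack studied here):

```python
import numpy as np

def poison_with_trigger(images, labels, target_label, patch_size=3, patch_value=1.0):
    """Stamp a small bottom-right trigger patch onto images and relabel them to the target class."""
    poisoned = images.copy()                               # images: (N, H, W) array in [0, 1]
    poisoned[:, -patch_size:, -patch_size:] = patch_value  # the trigger pattern
    return poisoned, np.full_like(labels, target_label)    # attacker's chosen label
```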
1 code implementation • ICLR 2021 • Jack Prescott, Xiao Zhang, David Evans
Mahloujifar et al. presented an empirical way to measure the concentration of a data distribution using samples, and employed it to find lower bounds on intrinsic robustness for several benchmark datasets.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Hannah Chen, Yangfeng Ji, David Evans
Most NLP datasets are manually labeled, so they suffer from inconsistent labeling or limited size.
1 code implementation • 30 Jun 2020 • Fnu Suya, Saeed Mahloujifar, Anshuman Suri, David Evans, Yuan Tian
Our attack is the first model-targeted poisoning attack that provides provable convergence for convex models, and in our experiments, it either exceeds or matches state-of-the-art attacks in terms of attack success rate and distance to the target model.
no code implementations • ACL 2020 • Hannah Chen, Yangfeng Ji, David Evans
The prevailing approach for training and evaluating paraphrase identification models is constructed as a binary classification problem: the model is given a pair of sentences, and is judged by how accurately it classifies pairs as either paraphrases or non-paraphrases.
1 code implementation • 21 May 2020 • Bargav Jayaraman, Lingxiao Wang, Katherine Knipmeyer, Quanquan Gu, David Evans
Since previous inference attacks fail in the imbalanced-prior setting, we develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function, and show that an attack that combines this with thresholds on the per-instance loss can achieve high PPV even in settings where other attacks appear to be ineffective.
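A schematic version of that intuition (with a hypothetical noise scale and trial count, not the authors' exact attack): perturb the candidate input with small random noise and measure how often the loss increases.

```python
import torch
import torch.nn.functional as F

def local_minimum_score(model, x, y, noise_std=0.01, trials=50):
    """Fraction of random perturbations that increase the loss; values near 1.0 suggest a member."""
    model.eval()
    with torch.no_grad():
        base_loss = F.cross_entropy(model(x), y)
        increases = 0
        for _ in range(trials):
            noisy = x + noise_std * torch.randn_like(x)
            increases += int(F.cross_entropy(model(noisy), y) > base_loss)
    return increases / trials
```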
1 code implementation • 21 Apr 2020 • Mainuddin Ahmad Jonas, David Evans
Deep Neural Networks (DNNs) are often vulnerable to adversarial examples. Several proposed defenses deploy an ensemble of models with the hope that, although the individual models may be vulnerable, an adversary will not be able to find an adversarial example that succeeds against the ensemble.
1 code implementation • 20 Mar 2020 • Anshuman Suri, David Evans
Despite vast research in adversarial examples, the root causes of model susceptibility are not well understood.
1 code implementation • 1 Mar 2020 • Xiao Zhang, Jinghui Chen, Quanquan Gu, David Evans
Starting with Gilmer et al. (2018), several works have demonstrated the inevitability of adversarial examples based on different assumptions about the underlying input probability space.
2 code implementations • ICML 2020 • Sicheng Zhu, Xiao Zhang, David Evans
We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the worst-case input perturbation.
9 code implementations • 10 Dec 2019 • Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konečný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrède Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Mariana Raykova, Hang Qi, Daniel Ramage, Ramesh Raskar, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu, Sen Zhao
FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches.
no code implementations • 30 Oct 2019 • Lingxiao Wang, Bargav Jayaraman, David Evans, Quanquan Gu
While many solutions for privacy-preserving convex empirical risk minimization (ERM) have been developed, privacy-preserving nonconvex ERM remains a challenge.
1 code implementation • 19 Aug 2019 • Fnu Suya, Jianfeng Chi, David Evans, Yuan Tian
In a black-box setting, the adversary only has API access to the target model and each query is expensive.
1 code implementation • NeurIPS 2019 • Saeed Mahloujifar, Xiao Zhang, Mohammad Mahmoody, David Evans
Many recent works have shown that adversarial examples that fool classifiers can be found by minimally perturbing a normal input.
1 code implementation • 24 Feb 2019 • Bargav Jayaraman, David Evans
Differential privacy is a strong notion of privacy that can be used to prove formal guarantees, in terms of a privacy budget, $\epsilon$, about how much information is leaked by a mechanism.
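For reference, the textbook form of the guarantee that the privacy budget $\epsilon$ quantifies:

```latex
% (\epsilon)-differential privacy: for any neighboring datasets D, D'
% (differing in a single record) and any set of outputs S of mechanism M,
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon} \cdot \Pr[\mathcal{M}(D') \in S]
```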
1 code implementation • NeurIPS 2018 • Bargav Jayaraman, Lingxiao Wang, David Evans, Quanquan Gu
We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state-of-the-art for both methods in the distributed learning setting.
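A schematic sketch of the gradient-perturbation approach, with illustrative clipping and noise constants and without the privacy accounting the paper is concerned with:

```python
import torch

def gradient_perturbation_step(theta, per_example_grads, clip_norm=1.0,
                               noise_multiplier=1.1, lr=0.1):
    """One noisy update on a flat parameter vector theta (all constants are illustrative)."""
    clipped = []
    for g in per_example_grads:                                       # one flat gradient per example
        scale = torch.clamp(clip_norm / (g.norm() + 1e-12), max=1.0)  # clip to clip_norm
        clipped.append(g * scale)
    summed = torch.stack(clipped).sum(dim=0)
    noise = noise_multiplier * clip_norm * torch.randn_like(theta)    # Gaussian noise
    return theta - lr * (summed + noise) / len(per_example_grads)
```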
1 code implementation • ICLR 2019 • Xiao Zhang, David Evans
Several recent works have developed methods for training classifiers that are certifiably robust against norm-bounded adversarial perturbations.
4 code implementations • arXiv 2018 • Hyrum S. Anderson, Anant Kharkar, Bobby Filar, David Evans, Phil Roth
We show in experiments that our method can attack a gradient-boosted machine learning model with evasion rates that are substantial and appear to be strongly dependent on the dataset.
1 code implementation • 23 Dec 2017 • Fnu Suya, Yuan Tian, David Evans, Paolo Papotti
Specifically, we consider the problem of attacking machine learning classifiers subject to a budget of feature modification cost while minimizing the number of queries, where each query returns only a class and confidence score.
1 code implementation • 30 May 2017 • Weilin Xu, David Evans, Yanjun Qi
Feature squeezing is a recently introduced framework for mitigating and detecting adversarial examples.
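A hedged sketch of the basic detection recipe, assuming a bit-depth squeezer and an illustrative L1 threshold (not the paper's exact configuration):

```python
import numpy as np

def reduce_bit_depth(x, bits=4):
    """Squeeze pixel values (assumed in [0, 1]) down to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def flag_adversarial(predict_fn, x, threshold=1.0):
    """Flag x when predictions on the original and squeezed inputs disagree by more than threshold (L1)."""
    p_original = predict_fn(x)                     # softmax probability vector
    p_squeezed = predict_fn(reduce_bit_depth(x))
    return np.abs(p_original - p_squeezed).sum() > threshold
```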
2 code implementations • Network and Distributed System Security Symposium 2018 • Weilin Xu, David Evans, Yanjun Qi
Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by \emph{adversarial examples} that are generated by adding small but purposeful distortions to natural examples.