1 code implementation • 3 Apr 2024 • Kamalika Chaudhuri, Chuan Guo, Laurens van der Maaten, Saeed Mahloujifar, Mark Tygert
The HCR bounds appear to be insufficient on their own to guarantee confidentiality of the inputs to inference with standard deep neural networks, ResNet-18 and Swin-T, pre-trained on ImageNet-1000, a data set containing 1000 classes.
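For context, the Hammersley-Chapman-Robbins (HCR) bound referenced here is a classical variance lower bound that does not require differentiability. A standard statement, in our own notation rather than necessarily the paper's, is:

```latex
% Hammersley-Chapman-Robbins lower bound (standard form; notation is ours, not the paper's).
% T estimates g(\theta) and is unbiased at \theta and \theta + \Delta.
\mathrm{Var}_{\theta}\!\left[T(X)\right] \;\ge\; \sup_{\Delta \neq 0}
\frac{\bigl(g(\theta + \Delta) - g(\theta)\bigr)^{2}}
     {\mathbb{E}_{\theta}\!\left[\left(\frac{p_{\theta+\Delta}(X)}{p_{\theta}(X)} - 1\right)^{\!2}\right]}
```

The denominator is the chi-squared divergence between the two distributions; larger divergence permits a sharper estimator, which is what ties such bounds to how much an adversary can recover about an input.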
no code implementations • 7 Mar 2024 • Shengyuan Hu, Saeed Mahloujifar, Virginia Smith, Kamalika Chaudhuri, Chuan Guo
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
no code implementations • 9 Jan 2024 • Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, Prateek Mittal
We introduce DP-ZO, a new method for fine-tuning large language models that preserves the privacy of training data by privatizing zeroth-order optimization.
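The abstract does not spell out the mechanism, but a minimal sketch of differentially private zeroth-order fine-tuning, assuming a two-point (SPSA-style) gradient estimate whose scalar loss difference is clipped and noised, could look like the following; all function names and hyperparameters here are illustrative, not the paper's:

```python
import numpy as np

def dp_zo_step(params, loss_fn, rng, lr=1e-4, mu=1e-3, clip=1.0, sigma=1.0):
    """One illustrative DP zeroth-order update (a sketch, not the paper's exact method).

    The gradient is estimated from a single scalar loss difference along a random
    direction; privatizing that one scalar (clip + Gaussian noise) is what makes
    the update differentially private in this sketch.
    """
    z = rng.standard_normal(params.shape)          # random perturbation direction
    loss_plus = loss_fn(params + mu * z)           # loss at positively perturbed params
    loss_minus = loss_fn(params - mu * z)          # loss at negatively perturbed params
    delta = (loss_plus - loss_minus) / (2 * mu)    # scalar directional-derivative estimate
    delta = np.clip(delta, -clip, clip)            # bound the sensitivity of the scalar
    delta += rng.normal(scale=sigma * clip)        # Gaussian noise calibrated to the clip bound
    return params - lr * delta * z                 # move along the sampled direction

# Toy usage: "fine-tune" a quadratic objective.
rng = np.random.default_rng(0)
theta = rng.standard_normal(10)
for _ in range(100):
    theta = dp_zo_step(theta, lambda p: float(np.sum(p ** 2)), rng)
```

In practice one would clip per-example loss differences and average over a batch before adding noise; the single-example version above is only meant to show where the privatization enters.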
no code implementations • 27 Oct 2023 • Jaiden Fairoze, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang
We construct the first provable watermarking scheme for language models with public detectability or verifiability: we use a private key for watermarking and a public key for watermark detection.
no code implementations • 17 Apr 2023 • Jiachen T. Wang, Saeed Mahloujifar, Tong Wu, Ruoxi Jia, Prateek Mittal
In this paper, we propose a new differential privacy paradigm called estimate-verify-release (EVR), which addresses the challenge of providing a strict upper bound on the privacy parameter in DP compositions by converting an estimate of the privacy parameter into a formal guarantee.
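A hedged sketch of how an estimate-verify-release style pipeline could be organized is shown below; the function names and the verification test are illustrative placeholders, not the paper's actual construction:

```python
def estimate_verify_release(mechanism, data, estimate_epsilon, verify, delta=1e-5):
    """Illustrative EVR-style wrapper (a sketch under our own assumptions).

    1. Estimate: obtain a (possibly heuristic) estimate of the privacy parameter.
    2. Verify: run a verifier that accepts only if the mechanism satisfies the
       estimated guarantee, up to the verifier's own failure probability.
    3. Release: output the mechanism's result only on acceptance; otherwise abort.
    """
    eps_hat = estimate_epsilon(mechanism)          # step 1: estimated privacy parameter
    if not verify(mechanism, eps_hat, delta):      # step 2: certify the estimate formally
        return None                                # abort: the estimate could not be certified
    return mechanism(data), eps_hat                # step 3: release output with certified eps
```

The point of the structure is that the released output only ever carries a privacy parameter that survived verification, so a loose or optimistic estimate cannot silently weaken the guarantee.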
no code implementations • 21 Feb 2023 • Sihui Dai, Saeed Mahloujifar, Chong Xiang, Vikash Sehwag, Pin-Yu Chen, Prateek Mittal
Using our framework, we present the first leaderboard, MultiRobustBench, for benchmarking multi-attack evaluation, which captures performance across attack types and attack strengths.
1 code implementation • ICLR 2023 • Xiangyu Qi, Tinghao Xie, Yiming Li, Saeed Mahloujifar, Prateek Mittal
This latent separation is so pervasive that a family of backdoor defenses directly take it as a default assumption (dubbed latent separability assumption), based on which to identify poison samples via cluster analysis in the latent space.
no code implementations • 29 Jan 2023 • Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T. Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal
Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts.
no code implementations • 8 Dec 2022 • Ashwinee Panda, Xinyu Tang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal
A major direction in differentially private machine learning is differentially private fine-tuning: pretraining a model on a source of "public data" and transferring the extracted features to downstream tasks.
no code implementations • 16 Sep 2022 • Jiachen T. Wang, Saeed Mahloujifar, Shouda Wang, Ruoxi Jia, Prateek Mittal
As an application of our analysis, we show that PTR and our theoretical results can be used to design differentially private variants of Byzantine-robust training algorithms that use robust statistics for gradient aggregation.
no code implementations • 27 Aug 2022 • Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang
In particular, for computationally bounded learners, we extend the recent result of Bubeck and Sellke [NeurIPS 2021], which shows that robust models might need more parameters, to the computational regime, and show that computationally bounded learners could provably need an even larger number of parameters.
no code implementations • 22 Jul 2022 • Tong Wu, Tianhao Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal
Our attack can be easily deployed in the real world since it only requires rotating the object, as we show in both image classification and object detection applications.
2 code implementations • 26 May 2022 • Xiangyu Qi, Tinghao Xie, Jiachen T. Wang, Tong Wu, Saeed Mahloujifar, Prateek Mittal
First, we uncover a post-hoc workflow underlying most prior work, where defenders passively allow the attack to proceed and then leverage the characteristics of the post-attacked model to uncover poison samples.
1 code implementation • 26 May 2022 • Xiangyu Qi, Tinghao Xie, Yiming Li, Saeed Mahloujifar, Prateek Mittal
This latent separation is so pervasive that a family of backdoor defenses directly take it as a default assumption (dubbed latent separability assumption), based on which to identify poison samples via cluster analysis in the latent space.
1 code implementation • 28 Apr 2022 • Sihui Dai, Saeed Mahloujifar, Prateek Mittal
Based on our generalization bound, we propose variation regularization (VR) which reduces variation of the feature extractor across the source threat model during training.
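One plausible way to instantiate a variation-style regularizer, assuming it penalizes how far features move under perturbations drawn from the source threat model, is sketched below in PyTorch-style Python; the exact form is our assumption, not the paper's definition:

```python
import torch
import torch.nn.functional as F

def vr_loss(feature_extractor, classifier, x, x_adv, y, lam=0.5):
    """Illustrative training loss with a variation-style regularizer (sketch only).

    x_adv is assumed to be a perturbation of x from the *source* threat model
    (e.g., an L-infinity PGD adversarial example); the regularizer penalizes how
    far the extracted features move under that perturbation.
    """
    feats = feature_extractor(x)                    # features of clean inputs
    feats_adv = feature_extractor(x_adv)            # features of perturbed inputs
    ce = F.cross_entropy(classifier(feats), y)      # standard classification loss
    variation = (feats - feats_adv).flatten(1).norm(dim=1).mean()  # feature variation term
    return ce + lam * variation                     # lam trades off accuracy vs. variation
```

The design intent conveyed by the abstract is that keeping features stable under source-threat-model perturbations should transfer some robustness to unseen threat models.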
no code implementations • 12 Apr 2022 • Saeed Mahloujifar, Alexandre Sablayrolles, Graham Cormode, Somesh Jha
A common countermeasure against MI attacks is to utilize differential privacy (DP) during model training to mask the presence of individual examples.
1 code implementation • 3 Feb 2022 • Chong Xiang, Alexander Valtchanov, Saeed Mahloujifar, Prateek Mittal
An attacker can use a single physically-realizable adversarial patch to make the object detector miss the detection of victim objects and undermine the functionality of object detection applications.
1 code implementation • 12 Dec 2021 • Ashwinee Panda, Saeed Mahloujifar, Arjun N. Bhagoji, Supriyo Chakraborty, Prateek Mittal
Federated learning is inherently vulnerable to model poisoning attacks because its decentralized nature allows attackers to participate with compromised devices.
no code implementations • 15 Oct 2021 • Xinyu Tang, Saeed Mahloujifar, Liwei Song, Virat Shejwalkar, Milad Nasr, Amir Houmansadr, Prateek Mittal
The goal of this work is to train ML models that have high membership privacy while largely preserving their utility; we therefore aim for an empirical membership privacy guarantee, as opposed to the provable privacy guarantees provided by techniques like differential privacy, since such techniques have been shown to degrade model utility.
no code implementations • 11 Oct 2021 • Sihui Dai, Saeed Mahloujifar, Prateek Mittal
To address this, we analyze the direct impact of activation shape on robustness through PAFs and observe that activation shapes with positive outputs on negative inputs and with high finite curvature can increase robustness.
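As an illustration of the kind of activation shape described (positive outputs on negative inputs, finite curvature), a parametric softplus with a learnable sharpness parameter is sketched below; this is our own example of a parametric activation function (PAF), not necessarily the family the paper studies:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParametricSoftplus(nn.Module):
    """Illustrative parametric activation (PAF) sketch.

    Softplus stays positive on negative inputs and has finite curvature; the
    learnable beta controls how sharply it bends, i.e., its curvature.
    """
    def __init__(self, beta_init=1.0):
        super().__init__()
        self.beta = nn.Parameter(torch.tensor(beta_init))

    def forward(self, x):
        beta = F.softplus(self.beta)               # keep the shape parameter positive
        return F.softplus(beta * x) / beta         # smooth, positive on negative inputs
```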
1 code implementation • 20 Aug 2021 • Chong Xiang, Saeed Mahloujifar, Prateek Mittal
Remarkably, PatchCleanser achieves 83.9% top-1 clean accuracy and 62.1% top-1 certified robust accuracy against a 2%-pixel square patch anywhere on the image for the 1000-class ImageNet dataset.
no code implementations • 21 Jun 2021 • Saeed Mahloujifar, Huseyin A. Inan, Melissa Chase, Esha Ghosh, Marcello Hasegawa
Indeed, our attack is a cheaper membership inference attack on text-generation models that requires neither knowledge of the target model nor expensive training of shadow text-generation models.
2 code implementations • ICLR 2022 • Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal
We circumvent this challenge by using additional data from proxy distributions learned by advanced generative models.
no code implementations • 26 Jan 2021 • Melissa Chase, Esha Ghosh, Saeed Mahloujifar
In this work, we study property inference in scenarios where the adversary can maliciously control part of the training data (poisoning data) with the goal of increasing the leakage.
2 code implementations • 10 Nov 2020 • Nicholas Carlini, Samuel Deng, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Shuang Song, Abhradeep Thakurta, Florian Tramer
A private machine learning algorithm hides as much as possible about its training data while still preserving accuracy.
1 code implementation • 30 Jun 2020 • Fnu Suya, Saeed Mahloujifar, Anshuman Suri, David Evans, Yuan Tian
Our attack is the first model-targeted poisoning attack that provides provable convergence for convex models, and in our experiments, it either exceeds or matches state-of-the-art attacks in terms of attack success rate and distance to the target model.
no code implementations • NeurIPS 2021 • Samuel Deng, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Abhradeep Thakurta
Some of the stronger poisoning attacks require the full knowledge of the training data.
no code implementations • 11 Jul 2019 • Omid Etesami, Saeed Mahloujifar, Mohammad Mahmoody
Product measures of dimension $n$ are known to be concentrated in Hamming distance: for any set $S$ in the product space of probability $\epsilon$, a random point in the space, with probability $1-\delta$, has a neighbor in $S$ that is different from the original point in only $O(\sqrt{n\ln(1/(\epsilon\delta))})$ coordinates.
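To make the rate concrete, a quick instantiation of the stated bound (the numbers here are ours, purely illustrative):

```latex
% Illustrative instantiation of the concentration bound (numbers are ours).
% With n = 10^4 and \epsilon = \delta = 0.01:
O\!\left(\sqrt{n \ln\!\bigl(1/(\epsilon\delta)\bigr)}\right)
= O\!\left(\sqrt{10^{4} \cdot \ln(10^{4})}\right)
\approx O\!\left(\sqrt{9.2 \times 10^{4}}\right)
\approx 304 \ \text{coordinates}
```

In other words, even a set of probability only 1% typically has a neighbor that differs in roughly 3% of the coordinates.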
no code implementations • 13 Jun 2019 • Dimitrios I. Diochnos, Saeed Mahloujifar, Mohammad Mahmoody
In this work, we initiate a formal study of probably approximately correct (PAC) learning under evasion attacks, where the adversary's goal is to \emph{misclassify} the adversarially perturbed sample point $\widetilde{x}$, i.e., $h(\widetilde{x})\neq c(\widetilde{x})$, where $c$ is the ground truth concept and $h$ is the learned hypothesis.
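Written out, the corresponding notion of risk under evasion attacks with perturbation budget $b$, in a form consistent with the notation above (our phrasing of a standard definition), is:

```latex
% Risk under evasion attacks with perturbation budget b (standard form; notation follows the text).
\mathrm{Risk}_{b}(h, c) \;=\;
\Pr_{x \sim D}\Bigl[\exists\, \widetilde{x} \in \mathrm{Ball}_{b}(x) \;:\; h(\widetilde{x}) \neq c(\widetilde{x})\Bigr]
```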
1 code implementation • NeurIPS 2019 • Saeed Mahloujifar, Xiao Zhang, Mohammad Mahmoody, David Evans
Many recent works have shown that adversarial examples that fool classifiers can be found by minimally perturbing a normal input.
no code implementations • 28 May 2019 • Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody
In the reverse direction, we also show that the existence of such a learning task, in which computational robustness beats information-theoretic robustness, requires computational hardness, as it implies (average-case) hardness of NP.
no code implementations • NeurIPS 2018 • Dimitrios I. Diochnos, Saeed Mahloujifar, Mohammad Mahmoody
We study both "inherent" bounds, which apply to any problem and any classifier for that problem, and bounds that apply to specific problems and specific hypothesis classes.
no code implementations • 2 Oct 2018 • Saeed Mahloujifar, Mohammad Mahmoody
Making learners robust to adversarial perturbation at test time (i.e., evasion attacks) or training time (i.e., poisoning attacks) has emerged as a challenging task.
no code implementations • 10 Sep 2018 • Saeed Mahloujifar, Mohammad Mahmoody, Ameer Mohammed
In this work, we demonstrate universal multi-party poisoning attacks that adapt and apply to any multi-party learning process with arbitrary interaction pattern between the parties.
no code implementations • 9 Sep 2018 • Saeed Mahloujifar, Dimitrios I. Diochnos, Mohammad Mahmoody
We show that if the metric probability space of the test instance is concentrated, any classifier with some initial constant error is inherently vulnerable to adversarial perturbations.
no code implementations • 10 Nov 2017 • Saeed Mahloujifar, Dimitrios I. Diochnos, Mohammad Mahmoody
They obtained $p$-tampering attacks that increase the error probability in the so-called targeted poisoning model, in which the adversary's goal is to increase the loss of the trained hypothesis over a particular test example.