Search Results for author: Prateek Mittal

Found 65 papers, 35 papers with code

Teach LLMs to Phish: Stealing Private Information from Language Models

no code implementations1 Mar 2024 Ashwinee Panda, Christopher A. Choquette-Choo, Zhengming Zhang, Yaoqing Yang, Prateek Mittal

When large language models are trained on private data, it can be a significant privacy risk for them to memorize and regurgitate sensitive information.

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

no code implementations7 Feb 2024 Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

We develop methods to identify critical regions that are vital for safety guardrails, and that are disentangled from utility-relevant regions at both the neuron and rank levels.
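
To make the neuron-level idea concrete, below is a hedged sketch of one plausible way to score weights that are important for safety but not for utility, using a simple first-order importance score. The scoring rule, the threshold top_p, and the tensor names are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def safety_only_mask(weight, grad_safety, grad_utility, top_p=0.01):
    """Flag weights that score high for safety importance but not for utility."""
    score_safety = (weight * grad_safety).abs().flatten()     # first-order importance
    score_utility = (weight * grad_utility).abs().flatten()
    k = max(1, int(top_p * weight.numel()))
    top_safety = set(torch.topk(score_safety, k).indices.tolist())
    top_utility = set(torch.topk(score_utility, k).indices.tolist())
    keep = torch.zeros(weight.numel(), dtype=torch.bool)
    keep[list(top_safety - top_utility)] = True               # safety-critical, not utility-critical
    return keep.view_as(weight)
```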

Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

no code implementations20 Jan 2024 Jiachen T. Wang, Prateek Mittal, Ruoxi Jia

This work aims to address an open problem in data valuation literature concerning the efficient computation of Data Shapley for weighted $K$ nearest neighbor algorithm (WKNN-Shapley).

Computational Efficiency Data Valuation +1
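
For context, the sketch below implements the classic closed-form Data Shapley recursion for the unweighted K-nearest-neighbor classifier (due to earlier work by Jia et al.), which is the starting point that WKNN-Shapley generalizes to weighted neighbors. Variable names and the toy usage are illustrative.

```python
import numpy as np

def unweighted_knn_shapley(X_train, y_train, x_test, y_test, K):
    """Exact Data Shapley value of each training point for one test example."""
    N = len(X_train)
    order = np.argsort(np.linalg.norm(X_train - x_test, axis=1))  # nearest first
    s = np.zeros(N)
    farthest = order[-1]
    s[farthest] = float(y_train[farthest] == y_test) / N
    for j in range(N - 2, -1, -1):                 # j = 0-indexed rank by distance
        i, i_next = order[j], order[j + 1]
        match_diff = float(y_train[i] == y_test) - float(y_train[i_next] == y_test)
        s[i] = s[i_next] + match_diff / K * min(K, j + 1) / (j + 1)
    return s

# Toy usage: the value of 5 training points for a single test query with K=3.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(5, 2)), np.array([0, 1, 0, 1, 0])
print(unweighted_knn_shapley(X, y, rng.normal(size=2), y_test=0, K=3))
```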

Private Fine-tuning of Large Language Models with Zeroth-order Optimization

no code implementations9 Jan 2024 Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, Prateek Mittal

We introduce DP-ZO, a new method for fine-tuning large language models that preserves the privacy of training data by privatizing zeroth-order optimization.

Privacy Preserving
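
A rough sketch of the core idea: the only per-example quantity in a two-point zeroth-order update is a scalar loss difference, so it can be clipped and noised directly instead of privatizing a full gradient. The step below is a hedged illustration; the hyperparameters (clipping bound C, noise scale sigma, perturbation scale eps) and the assumption that per_example_loss returns a vector of per-example losses are ours, not the paper's exact recipe.

```python
import torch

def dp_zo_step(params, per_example_loss, batch, lr=1e-4, eps=1e-3, C=1.0, sigma=1.0):
    """One privatized zeroth-order (two-point) update on a batch."""
    z = [torch.randn_like(p) for p in params]       # shared perturbation direction

    def perturb(scale):
        with torch.no_grad():
            for p, zi in zip(params, z):
                p.add_(scale * eps * zi)

    perturb(+1)
    loss_plus = per_example_loss(batch)              # losses at theta + eps*z
    perturb(-2)
    loss_minus = per_example_loss(batch)             # losses at theta - eps*z
    perturb(+1)                                      # restore theta
    g = (loss_plus - loss_minus) / (2 * eps)         # per-example scalar gradients
    g = torch.clamp(g.detach(), -C, C)               # clip each scalar to [-C, C]
    noisy = (g.sum() + sigma * C * torch.randn(())) / g.numel()
    with torch.no_grad():
        for p, zi in zip(params, z):
            p.add_(-lr * noisy * zi)
```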

PatchCURE: Improving Certifiable Robustness, Model Utility, and Computation Efficiency of Adversarial Patch Defenses

no code implementations19 Oct 2023 Chong Xiang, Tong Wu, Sihui Dai, Jonathan Petit, Suman Jana, Prateek Mittal

State-of-the-art defenses against adversarial patch attacks can now achieve strong certifiable robustness with a marginal drop in model utility.

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

1 code implementation5 Oct 2023 Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson

Optimizing large language models (LLMs) for downstream use cases often involves the customization of pre-trained LLMs through further fine-tuning.

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

no code implementations30 Aug 2023 Jiachen T. Wang, Yuqing Zhu, Yu-Xiang Wang, Ruoxi Jia, Prateek Mittal

Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research.

Data Valuation

BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection

no code implementations23 Aug 2023 Tinghao Xie, Xiangyu Qi, Ping He, Yiming Li, Jiachen T. Wang, Prateek Mittal

We present a novel defense against backdoor attacks on Deep Neural Networks (DNNs), in which adversaries covertly implant malicious behaviors (backdoors) into DNNs.

Food Classification using Joint Representation of Visual and Textual Data

no code implementations3 Aug 2023 Prateek Mittal, Puneet Goyal, Joohi Chauhan

The experimental results show that the proposed network outperforms the other methods: a significant accuracy improvement of 11.57% for image classification and 6.34% for text classification is observed when compared with the second-best performing method.

Image Classification text-classification +1

Visual Adversarial Examples Jailbreak Aligned Large Language Models

1 code implementation22 Jun 2023 Xiangyu Qi, Kaixuan Huang, Ashwinee Panda, Peter Henderson, Mengdi Wang, Prateek Mittal

Recently, there has been a surge of interest in integrating vision into Large Language Models (LLMs), exemplified by Visual Language Models (VLMs) such as Flamingo and GPT-4.

Privacy-Preserving In-Context Learning for Large Language Models

no code implementations2 May 2023 Tong Wu, Ashwinee Panda, Jiachen T. Wang, Prateek Mittal

Based on the general paradigm of DP-ICL, we instantiate several techniques showing how to privatize ICL for text classification and language generation.

In-Context Learning Privacy Preserving +3
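
One hedged sketch of what privatized in-context classification could look like under this paradigm: disjoint subsets of the private demonstrations each yield one prediction, and the vote histogram is released with noise. The helper query_llm, the label set, and the Gaussian-noise aggregation here are illustrative assumptions.

```python
import numpy as np

def dp_icl_classify(query, private_examples, labels, query_llm, n_subsets=10, sigma=1.0):
    """Private-aggregation ICL: each disjoint subset of demonstrations casts one vote."""
    subsets = np.array_split(np.arange(len(private_examples)), n_subsets)
    votes = np.zeros(len(labels))
    for idx in subsets:
        demos = [private_examples[i] for i in idx]   # disjoint demonstration subset
        pred = query_llm(demos, query)               # assumed to return a label in `labels`
        votes[labels.index(pred)] += 1
    # Each private example influences at most one vote, so a noisy argmax over the
    # histogram can be calibrated to satisfy differential privacy.
    noisy_votes = votes + np.random.normal(scale=sigma, size=len(labels))
    return labels[int(np.argmax(noisy_votes))]
```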

A Randomized Approach for Tight Privacy Accounting

no code implementations17 Apr 2023 Jiachen T. Wang, Saeed Mahloujifar, Tong Wu, Ruoxi Jia, Prateek Mittal

In this paper, we propose a new differential privacy paradigm called estimate-verify-release (EVR), which addresses the challenge of providing a strict upper bound on the privacy parameter in DP compositions by converting an estimate of the privacy parameter into a formal guarantee.

Privacy Preserving

MultiRobustBench: Benchmarking Robustness Against Multiple Attacks

no code implementations21 Feb 2023 Sihui Dai, Saeed Mahloujifar, Chong Xiang, Vikash Sehwag, Pin-Yu Chen, Prateek Mittal

Using our framework, we present the first leaderboard, MultiRobustBench, for benchmarking multiattack evaluation which captures performance across attack types and attack strengths.

Benchmarking

Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning

1 code implementation3 Feb 2023 Jacob Brown, Xi Jiang, Van Tran, Arjun Nitin Bhagoji, Nguyen Phong Hoang, Nick Feamster, Prateek Mittal, Vinod Yegneswaran

In this paper, we explore how machine learning (ML) models can (1) help streamline the detection process, (2) improve the potential of using large-scale datasets for censorship detection, and (3) discover new censorship instances and blocking signatures missed by existing heuristic methods.

Blocking

Revisiting the Assumption of Latent Separability for Backdoor Defenses

1 code implementation ICLR 2023 Xiangyu Qi, Tinghao Xie, Yiming Li, Saeed Mahloujifar, Prateek Mittal

This latent separation is so pervasive that a family of backdoor defenses directly take it as a default assumption (dubbed latent separability assumption), based on which to identify poison samples via cluster analysis in the latent space.
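
For readers unfamiliar with the assumption being revisited, the sketch below shows the kind of latent-space clustering defense it underlies: per class, penultimate-layer features are split into two clusters and the minority cluster is flagged as suspected poison. The feature extractor and clustering choices are illustrative; the paper studies adaptive attacks against exactly this style of defense.

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_suspected_poison(features, labels):
    """Per-class 2-means clustering on latent features; flag the minority cluster."""
    suspected = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        assignment = KMeans(n_clusters=2, n_init=10).fit_predict(features[idx])
        minority = 0 if (assignment == 0).sum() < (assignment == 1).sum() else 1
        suspected[idx[assignment == minority]] = True
    return suspected
```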

Uncovering Adversarial Risks of Test-Time Adaptation

no code implementations29 Jan 2023 Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T. Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts.

Test-time Adaptation

DP-RAFT: A Differentially Private Recipe for Accelerated Fine-Tuning

no code implementations8 Dec 2022 Ashwinee Panda, Xinyu Tang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

A major direction in differentially private machine learning is differentially private fine-tuning: pretraining a model on a source of "public data" and transferring the extracted features to downstream tasks.

Image Classification
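
A hedged sketch of the "private fine-tuning on public features" recipe this entry describes: a frozen, publicly pretrained backbone extracts features, and only a small head is trained with per-example gradient clipping and Gaussian noise (DP-SGD). The linear head, clipping norm C, and noise multiplier sigma are illustrative assumptions.

```python
import torch

def dp_linear_probe_step(head, feats, targets, lr=0.1, C=1.0, sigma=1.0):
    """One DP-SGD step on a linear head over frozen, publicly pretrained features."""
    clipped = []
    for x, y in zip(feats, targets):                       # per-example gradients
        head.zero_grad()
        loss = torch.nn.functional.cross_entropy(head(x[None]), y[None])
        loss.backward()
        g = torch.cat([head.weight.grad.flatten(), head.bias.grad.flatten()])
        clipped.append(g / max(1.0, g.norm().item() / C))  # clip to norm at most C
    g_sum = torch.stack(clipped).sum(dim=0)
    g_noisy = (g_sum + sigma * C * torch.randn_like(g_sum)) / len(feats)
    with torch.no_grad():
        n_w = head.weight.numel()
        head.weight -= lr * g_noisy[:n_w].view_as(head.weight)
        head.bias -= lr * g_noisy[n_w:].view_as(head.bias)
```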

Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

no code implementations16 Sep 2022 Jiachen T. Wang, Saeed Mahloujifar, Shouda Wang, Ruoxi Jia, Prateek Mittal

As an application of our analysis, we show that PTR and our theoretical results can be used to design differentially private variants of Byzantine-robust training algorithms that use robust statistics for gradient aggregation.

A Light Recipe to Train Robust Vision Transformers

1 code implementation15 Sep 2022 Edoardo Debenedetti, Vikash Sehwag, Prateek Mittal

Additionally, investigating the reasons for the robustness of our models, we show that it is easier to generate strong attacks during training when using our recipe and that this leads to better robustness at test time.

Adversarial Robustness Data Augmentation +1

Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation

no code implementations22 Jul 2022 Tong Wu, Tianhao Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

Our attack can be easily deployed in the real world since it only requires rotating the object, as we show in both image classification and object detection applications.

Data Augmentation Image Classification +3
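
A minimal sketch of a rotation-trigger poisoning setup in this spirit: a small fraction of training examples is rotated by a fixed angle and relabeled to the attacker's target class, so rotated inputs activate the backdoor at test time. The poison rate, angle, and dataset format are illustrative assumptions.

```python
import random
import torchvision.transforms.functional as TF

def poison_with_rotation(dataset, target_class, angle=45.0, poison_rate=0.01):
    """Rotate a small fraction of training images and relabel them to the target class."""
    poisoned = list(dataset)                                  # list of (image, label) pairs
    n_poison = int(poison_rate * len(poisoned))
    for i in random.sample(range(len(poisoned)), n_poison):
        img, _ = poisoned[i]
        poisoned[i] = (TF.rotate(img, angle), target_class)   # rotation acts as the trigger
    return poisoned
```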

Understanding Robust Learning through the Lens of Representation Similarities

1 code implementation20 Jun 2022 Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Y. Zhao, Prateek Mittal, Haitao Zheng

Representation learning, i.e., the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs).

Representation Learning

Neurotoxin: Durable Backdoors in Federated Learning

2 code implementations12 Jun 2022 Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael W. Mahoney, Joseph E. Gonzalez, Kannan Ramchandran, Prateek Mittal

In this type of attack, the goal of the attacker is to use poisoned updates to implant so-called backdoors into the learned model such that, at test time, the model's outputs can be fixed to a given target for certain inputs.

Backdoor Attack Federated Learning +1
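
As a hedged illustration of how a durable backdoor update might be constructed in this setting, the sketch below constrains the attacker's update to coordinates that benign clients update least, by zeroing out the top fraction of the observed benign gradient. The masking fraction and tensor shapes are assumptions, not the paper's exact procedure.

```python
import torch

def neurotoxin_style_mask(attacker_grad, observed_benign_grad, top_fraction=0.1):
    """Zero the attacker's update on the coordinates benign clients update most."""
    flat = observed_benign_grad.abs().flatten()
    k = int(top_fraction * flat.numel())
    mask = torch.ones_like(flat)
    if k > 0:
        mask[torch.topk(flat, k).indices] = 0.0       # avoid heavily-updated coordinates
    return (attacker_grad.flatten() * mask).view_as(attacker_grad)
```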

Towards A Proactive ML Approach for Detecting Backdoor Poison Samples

2 code implementations26 May 2022 Xiangyu Qi, Tinghao Xie, Jiachen T. Wang, Tong Wu, Saeed Mahloujifar, Prateek Mittal

First, we uncover a post-hoc workflow underlying most prior work, where defenders passively allow the attack to proceed and then leverage the characteristics of the post-attacked model to uncover poison samples.

Circumventing Backdoor Defenses That Are Based on Latent Separability

1 code implementation26 May 2022 Xiangyu Qi, Tinghao Xie, Yiming Li, Saeed Mahloujifar, Prateek Mittal

This latent separation is so pervasive that a family of backdoor defenses directly take it as a default assumption (dubbed latent separability assumption), based on which to identify poison samples via cluster analysis in the latent space.

Formulating Robustness Against Unforeseen Attacks

1 code implementation28 Apr 2022 Sihui Dai, Saeed Mahloujifar, Prateek Mittal

Based on our generalization bound, we propose variation regularization (VR) which reduces variation of the feature extractor across the source threat model during training.

ObjectSeeker: Certifiably Robust Object Detection against Patch Hiding Attacks via Patch-agnostic Masking

1 code implementation3 Feb 2022 Chong Xiang, Alexander Valtchanov, Saeed Mahloujifar, Prateek Mittal

An attacker can use a single physically-realizable adversarial patch to make the object detector miss the detection of victim objects and undermine the functionality of object detection applications.

Autonomous Vehicles Object +2

SparseFed: Mitigating Model Poisoning Attacks in Federated Learning with Sparsification

1 code implementation12 Dec 2021 Ashwinee Panda, Saeed Mahloujifar, Arjun N. Bhagoji, Supriyo Chakraborty, Prateek Mittal

Federated learning is inherently vulnerable to model poisoning attacks because its decentralized nature allows attackers to participate with compromised devices.

Federated Learning Model Poisoning

Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture

no code implementations15 Oct 2021 Xinyu Tang, Saeed Mahloujifar, Liwei Song, Virat Shejwalkar, Milad Nasr, Amir Houmansadr, Prateek Mittal

The goal of this work is to train ML models that have high membership privacy while largely preserving their utility; we therefore aim for an empirical membership privacy guarantee as opposed to the provable privacy guarantees provided by techniques like differential privacy, as such techniques are shown to deteriorate model utility.

Privacy Preserving

Parameterizing Activation Functions for Adversarial Robustness

no code implementations11 Oct 2021 Sihui Dai, Saeed Mahloujifar, Prateek Mittal

To address this, we analyze the direct impact of activation shape on robustness through PAFs and observe that activation shapes with positive outputs on negative inputs and with high finite curvature can increase robustness.

Adversarial Robustness

PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

1 code implementation20 Aug 2021 Chong Xiang, Saeed Mahloujifar, Prateek Mittal

Remarkably, PatchCleanser achieves 83.9% top-1 clean accuracy and 62.1% top-1 certified robust accuracy against a 2%-pixel square patch anywhere on the image for the 1000-class ImageNet dataset.

Image Classification
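
The sketch below follows the double-masking inference procedure at a high level, as we understand it: if all one-mask predictions agree, return that label; otherwise verify each disagreeing mask with a second round of masks, and fall back to the majority label. The helpers model and apply_mask (which blanks out one candidate patch region) are assumed.

```python
from collections import Counter

def double_masking_predict(model, image, mask_set, apply_mask):
    """Two-round masked inference: agreement, disagreer verification, else majority."""
    first = {m: model(apply_mask(image, m)) for m in mask_set}
    majority = Counter(first.values()).most_common(1)[0][0]
    if all(label == majority for label in first.values()):
        return majority                                 # all one-mask predictions agree
    for m1, y1 in first.items():
        if y1 == majority:
            continue
        second = [model(apply_mask(apply_mask(image, m1), m2)) for m2 in mask_set]
        if all(label == y1 for label in second):
            return y1                                   # disagreeing mask is verified
    return majority                                     # fall back to the majority label
```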

PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

1 code implementation26 Apr 2021 Chong Xiang, Prateek Mittal

Recent provably robust defenses generally follow the PatchGuard framework by using CNNs with small receptive fields and secure feature aggregation for robust model predictions.

Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries

1 code implementation16 Apr 2021 Arjun Nitin Bhagoji, Daniel Cullina, Vikash Sehwag, Prateek Mittal

In particular, it is critical to determine classifier-agnostic bounds on the training loss to establish when learning is possible.

DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks

1 code implementation5 Feb 2021 Chong Xiang, Prateek Mittal

In this paper, we propose DetectorGuard as the first general framework for building provably robust object detectors against localized patch hiding attacks.

Image Classification Object +2

Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence

1 code implementation26 Oct 2020 Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song

Log-based cyber threat hunting has emerged as an important solution to counter sophisticated attacks.

RobustBench: a standardized adversarial robustness benchmark

1 code implementation19 Oct 2020 Francesco Croce, Maksym Andriushchenko, Vikash Sehwag, Edoardo Debenedetti, Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein

As a research community, we are still lacking a systematic understanding of the progress on adversarial robustness which often makes it hard to identify the most promising ideas in training robust models.

Adversarial Robustness Benchmarking +3

A Critical Evaluation of Open-World Machine Learning

no code implementations8 Jul 2020 Liwei Song, Vikash Sehwag, Arjun Nitin Bhagoji, Prateek Mittal

With our evaluation across 6 OOD detectors, we find that the choice of in-distribution data, model architecture and OOD data have a strong impact on OOD detection performance, inducing false positive rates in excess of $70\%$.

BIG-bench Machine Learning Out of Distribution (OOD) Detection

Time for a Background Check! Uncovering the impact of Background Features on Deep Neural Networks

no code implementations24 Jun 2020 Vikash Sehwag, Rajvardhan Oak, Mung Chiang, Prateek Mittal

With increasing expressive power, deep neural networks have significantly improved the state-of-the-art on image classification datasets, such as ImageNet.

Image Classification

PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking

2 code implementations17 May 2020 Chong Xiang, Arjun Nitin Bhagoji, Vikash Sehwag, Prateek Mittal

In this paper, we propose a general defense framework called PatchGuard that can achieve high provable robustness while maintaining high clean accuracy against localized adversarial patches.

FALCON: Honest-Majority Maliciously Secure Framework for Private Deep Learning

1 code implementation5 Apr 2020 Sameer Wagh, Shruti Tople, Fabrice Benhamouda, Eyal Kushilevitz, Prateek Mittal, Tal Rabin

For private training, we are about 6x faster than SecureNN, 4.4x faster than ABY3 and about 2-60x more communication efficient.

Systematic Evaluation of Privacy Risks of Machine Learning Models

1 code implementation24 Mar 2020 Liwei Song, Prateek Mittal

Machine learning models are prone to memorizing sensitive data, making them vulnerable to membership inference attacks in which an adversary aims to guess if an input sample was used to train the model.

BIG-bench Machine Learning Inference Attack
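
A minimal sketch of the metric-based style of membership inference such evaluations include: predict "member" when the model's loss on an example falls below a per-class threshold calibrated on known non-member data. The percentile-based calibration below is an illustrative choice, not the paper's exact attack.

```python
import numpy as np

def threshold_membership_inference(losses, labels, nonmember_losses, nonmember_labels):
    """Predict 'member' when the loss is below a per-class threshold."""
    is_member = np.zeros(len(losses), dtype=bool)
    for c in np.unique(labels):
        # Calibrate a per-class threshold from known non-member losses
        # (illustrative choice: 20th percentile).
        thr = np.percentile(nonmember_losses[nonmember_labels == c], 20)
        idx = labels == c
        is_member[idx] = losses[idx] < thr
    return is_member
```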

Towards Probabilistic Verification of Machine Unlearning

1 code implementation9 Mar 2020 David Marco Sommer, Liwei Song, Sameer Wagh, Prateek Mittal

In this work, we take the first step in proposing a formal framework to study the design of such verification mechanisms for data deletion requests -- also known as machine unlearning -- in the context of systems that provide machine learning as a service (MLaaS).

backdoor defense Machine Unlearning +1

HYDRA: Pruning Adversarially Robust Neural Networks

4 code implementations NeurIPS 2020 Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

We demonstrate that our approach, titled HYDRA, achieves compressed networks with state-of-the-art benign and robust accuracy, simultaneously.

Network Pruning

Lower Bounds on Adversarial Robustness from Optimal Transport

1 code implementation NeurIPS 2019 Arjun Nitin Bhagoji, Daniel Cullina, Prateek Mittal

In this paper, we use optimal transport to characterize the minimum possible loss in an adversarial classification scenario.

Adversarial Robustness Classification +1

Towards Compact and Robust Deep Neural Networks

no code implementations14 Jun 2019 Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

In this work, we rigorously study the extension of network pruning strategies to preserve both benign accuracy and robustness of a network.

Adversarial Robustness Network Pruning

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

1 code implementation24 May 2019 Liwei Song, Reza Shokri, Prateek Mittal

To perform the membership inference attacks, we leverage the existing inference methods that exploit model predictions.

Adversarial Defense BIG-bench Machine Learning +1

Better the Devil you Know: An Analysis of Evasion Attacks using Out-of-Distribution Adversarial Examples

no code implementations5 May 2019 Vikash Sehwag, Arjun Nitin Bhagoji, Liwei Song, Chawin Sitawarin, Daniel Cullina, Mung Chiang, Prateek Mittal

A large body of recent work has investigated the phenomenon of evasion attacks using adversarial examples for deep learning systems, where the addition of norm-bounded perturbations to the test inputs leads to incorrect output classification.

Autonomous Driving General Classification

PAC-learning in the presence of adversaries

no code implementations NeurIPS 2018 Daniel Cullina, Arjun Nitin Bhagoji, Prateek Mittal

We then explicitly derive the adversarial VC-dimension for halfspace classifiers in the presence of a sample-wise norm-constrained adversary of the type commonly studied for evasion attacks and show that it is the same as the standard VC-dimension, closing an open question.

Open-Ended Question Answering PAC learning

Analyzing Federated Learning through an Adversarial Lens

2 code implementations ICLR 2019 Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal, Seraphin Calo

Federated learning distributes model training among a multitude of agents, who, guided by privacy concerns, perform training using their local data but share only model parameter updates, for iterative aggregation at the server.

Federated Learning Model Poisoning

Robust Website Fingerprinting Through the Cache Occupancy Channel

no code implementations17 Nov 2018 Anatoly Shusterman, Lachlan Kang, Yarden Haskal, Yosef Meltser, Prateek Mittal, Yossi Oren, Yuval Yarom

In this work we investigate these attacks under a different attack model, in which the adversary is capable of running a small amount of unprivileged code on the target user's computer.

Website Fingerprinting Attacks

Partial Recovery of Erdős-Rényi Graph Alignment via $k$-Core Alignment

no code implementations10 Sep 2018 Daniel Cullina, Negar Kiyavash, Prateek Mittal, H. Vincent Poor

This estimator searches for an alignment in which the intersection of the correlated graphs using this alignment has a minimum degree of $k$.

SAQL: A Stream-based Query System for Real-Time Abnormal System Behavior Detection

1 code implementation25 Jun 2018 Peng Gao, Xusheng Xiao, Ding Li, Zhichun Li, Kangkook Jee, Zhen-Yu Wu, Chung Hwan Kim, Sanjeev R. Kulkarni, Prateek Mittal

To facilitate the task of expressing anomalies based on expert knowledge, our system provides a domain-specific query language, SAQL, which allows analysts to express models for (1) rule-based anomalies, (2) time-series anomalies, (3) invariant-based anomalies, and (4) outlier-based anomalies.

Cryptography and Security Databases

PAC-learning in the presence of evasion adversaries

no code implementations5 Jun 2018 Daniel Cullina, Arjun Nitin Bhagoji, Prateek Mittal

We then explicitly derive the adversarial VC-dimension for halfspace classifiers in the presence of a sample-wise norm-constrained adversary of the type commonly studied for evasion attacks and show that it is the same as the standard VC-dimension, closing an open question.

Open-Ended Question Answering PAC learning

A Differential Privacy Mechanism Design Under Matrix-Valued Query

1 code implementation26 Feb 2018 Thee Chanyaswad, Alex Dytso, H. Vincent Poor, Prateek Mittal

Although standard mechanisms can be extended to a matrix-valued query function by adding i.i.d. noise to each element of the matrix, this method is often sub-optimal as it forfeits an opportunity to exploit the structural characteristics typically associated with matrix analysis.

DARTS: Deceiving Autonomous Cars with Toxic Signs

1 code implementation18 Feb 2018 Chawin Sitawarin, Arjun Nitin Bhagoji, Arsalan Mosenia, Mung Chiang, Prateek Mittal

In this paper, we propose and examine security attacks against sign recognition systems for Deceiving Autonomous caRs with Toxic Signs (we call the proposed attacks DARTS).

Traffic Sign Recognition

Rogue Signs: Deceiving Traffic Sign Recognition with Malicious Ads and Logos

1 code implementation9 Jan 2018 Chawin Sitawarin, Arjun Nitin Bhagoji, Arsalan Mosenia, Prateek Mittal, Mung Chiang

Our attack pipeline generates adversarial samples which are robust to the environmental conditions and noisy image transformations present in the physical world.

Traffic Sign Recognition

MVG Mechanism: Differential Privacy under Matrix-Valued Query

no code implementations2 Jan 2018 Thee Chanyaswad, Alex Dytso, H. Vincent Poor, Prateek Mittal

To address this challenge, we propose a novel differential privacy mechanism called the Matrix-Variate Gaussian (MVG) mechanism, which adds a matrix-valued noise drawn from a matrix-variate Gaussian distribution, and we rigorously prove that the MVG mechanism preserves $(\epsilon,\delta)$-differential privacy.
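
For intuition, here is a small numerical sketch of drawing matrix-variate Gaussian noise MN(0, Sigma_row, Sigma_col) and adding it to a matrix-valued query answer. The covariance matrices used here are placeholders; the paper derives how they must be set to guarantee (epsilon, delta)-differential privacy.

```python
import numpy as np

def sample_mvg(sigma_row, sigma_col, rng=None):
    """Draw matrix-variate Gaussian noise MN(0, sigma_row, sigma_col)."""
    if rng is None:
        rng = np.random.default_rng()
    A = np.linalg.cholesky(sigma_row)                 # sigma_row = A @ A.T
    B = np.linalg.cholesky(sigma_col)                 # sigma_col = B @ B.T
    G = rng.standard_normal((sigma_row.shape[0], sigma_col.shape[0]))
    return A @ G @ B.T                                # Z ~ MN(0, sigma_row, sigma_col)

# Example: add structured noise to a 3x2 matrix-valued query answer
# (placeholder covariances, not a privacy-calibrated choice).
query_answer = np.arange(6.0).reshape(3, 2)
noisy_answer = query_answer + sample_mvg(np.eye(3), 0.5 * np.eye(2))
```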

Coupling Random Orthonormal Projection with Gaussian Generative Model for Non-Interactive Private Data Release

1 code implementation31 Aug 2017 Thee Chanyaswad, Changchang Liu, Prateek Mittal

A key challenge facing the design of differential privacy in the non-interactive setting is to maintain the utility of the released data.

Cryptography and Security

Inaudible Voice Commands

1 code implementation24 Aug 2017 Liwei Song, Prateek Mittal

Voice assistants like Siri enable us to control IoT devices conveniently with voice commands, however, they also provide new attack opportunities for adversaries.

Cryptography and Security

On the Simultaneous Preservation of Privacy and Community Structure in Anonymized Networks

no code implementations25 Mar 2016 Daniel Cullina, Kushagra Singhal, Negar Kiyavash, Prateek Mittal

We ask the question "Does there exist a regime where the network cannot be deanonymized perfectly, yet the community structure could be learned?"

Community Detection Stochastic Block Model
