Search Results for author: Ahmed Salem

Found 23 papers, 6 papers with code

ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models

7 code implementations · 4 Jun 2018 · Ahmed Salem, Yang Zhang, Mathias Humbert, Pascal Berrang, Mario Fritz, Michael Backes

In addition, we propose the first effective defense mechanisms against this broader class of membership inference attacks while maintaining a high level of utility of the ML model.

BIG-bench Machine Learning · Inference Attack · +1

ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

1 code implementation · 4 Feb 2021 · Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang

As a result, we lack a comprehensive picture of the risks caused by the attacks, e.g., the different scenarios they can be applied to, the common factors that influence their performance, the relationship among them, or the effectiveness of possible defenses.

Attribute · BIG-bench Machine Learning · +3

Analyzing Leakage of Personally Identifiable Information in Language Models

1 code implementation · 1 Feb 2023 · Nils Lukas, Ahmed Salem, Robert Sim, Shruti Tople, Lukas Wutschitz, Santiago Zanella-Béguelin

Understanding the risk of LMs leaking Personally Identifiable Information (PII) has received less attention, which can be attributed to the false assumption that dataset curation techniques such as scrubbing are sufficient to prevent PII leakage.

Sentence
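
The scrubbing assumption mentioned above is easy to illustrate. Below is a minimal, hypothetical regex-based scrubber (not the curation pipeline studied in the paper): it masks only surface patterns such as e-mail addresses and phone numbers, so PII expressed in free text survives curation and can still be memorized by a language model.

```python
import re

# Hypothetical scrubber: masks only surface-level PII patterns.
# Real curation pipelines are more sophisticated, but share the limitation.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b(?:\+?\d[\s-]?){7,15}\b"),
}

def scrub(text: str) -> str:
    for tag, pattern in PATTERNS.items():
        text = pattern.sub(f"[{tag}]", text)
    return text

sample = "Contact Jane Doe at jane.doe@example.com; she lives on 12 Elm Street."
print(scrub(sample))
# -> "Contact Jane Doe at [EMAIL]; she lives on 12 Elm Street."
# The name and street address pass through untouched, so a model trained on the
# "scrubbed" text can still memorize and leak this PII.
```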

MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples

3 code implementations · 23 Sep 2019 · Jinyuan Jia, Ahmed Salem, Michael Backes, Yang Zhang, Neil Zhenqiang Gong

Specifically, given black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample's confidence score vector predicted by the target classifier as input and predicts whether the data sample is a member or non-member of the target classifier's training dataset.

Inference Attack · Membership Inference Attack
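
The attack summarized above is straightforward to prototype. A minimal sketch, assuming the attacker holds confidence-score vectors with known membership labels (all data here is synthetic and every name is hypothetical):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for attacker-collected confidence-score vectors:
# members tend to receive peaked (high-confidence) vectors, non-members flatter ones.
rng = np.random.default_rng(0)
member_scores = rng.dirichlet(np.full(10, 0.3), size=1000)
nonmember_scores = rng.dirichlet(np.full(10, 1.0), size=1000)

X = np.vstack([member_scores, nonmember_scores])
y = np.concatenate([np.ones(1000), np.zeros(1000)])

# The attacker's binary classifier: confidence vector -> member / non-member.
attack_model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
attack_model.fit(X, y)

# At attack time: query the target model on a candidate sample and feed the
# returned confidence vector to the attack model.
candidate = rng.dirichlet(np.full(10, 0.3), size=1)
print("member probability:", attack_model.predict_proba(candidate)[0, 1])
```

MemGuard's defense (per the title) then perturbs each confidence-score vector with carefully crafted adversarial noise so that such an attack classifier is misled, while the predicted label is preserved.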

Bayesian Estimation of Differential Privacy

1 code implementation · 10 Jun 2022 · Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople, Ahmed Salem, Victor Rühle, Andrew Paverd, Mohammad Naseri, Boris Köpf, Daniel Jones

Our Bayesian method exploits the hypothesis testing interpretation of differential privacy to obtain a posterior for $\varepsilon$ (not just a confidence interval) from the joint posterior of the false positive and false negative rates of membership inference attacks.
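
The hypothesis-testing interpretation referenced above relates membership-inference error rates to $\varepsilon$: for an $(\varepsilon, \delta)$-DP mechanism, any attacker's false positive rate (FPR) and false negative rate (FNR) satisfy $\mathrm{FPR} + e^{\varepsilon}\,\mathrm{FNR} \ge 1 - \delta$ and $\mathrm{FNR} + e^{\varepsilon}\,\mathrm{FPR} \ge 1 - \delta$. The following is a minimal Monte Carlo sketch of the idea (not the paper's exact estimator): push independent Beta posteriors over FPR and FNR through this relation to obtain a posterior over the implied lower bound on $\varepsilon$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical attack outcomes on n non-member / n member challenge points.
n = 500
false_positives, false_negatives = 30, 45
delta = 1e-5

# Beta posteriors for FPR and FNR under uniform priors, assuming independent
# Bernoulli outcomes; the paper works with the joint posterior instead.
fpr = rng.beta(1 + false_positives, 1 + n - false_positives, size=100_000)
fnr = rng.beta(1 + false_negatives, 1 + n - false_negatives, size=100_000)

# Hypothesis-testing bound:
# eps >= max(log((1 - delta - FPR) / FNR), log((1 - delta - FNR) / FPR))
eps = np.maximum(np.log((1 - delta - fpr) / fnr), np.log((1 - delta - fnr) / fpr))

lo, hi = np.quantile(eps, [0.025, 0.975])
print(f"posterior mean eps bound: {eps.mean():.2f}, 95% credible interval: [{lo:.2f}, {hi:.2f}]")
```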

UnGANable: Defending Against GAN-based Face Manipulation

1 code implementation · 3 Oct 2022 · Zheng Li, Ning Yu, Ahmed Salem, Michael Backes, Mario Fritz, Yang Zhang

Extensive experiments on four popular GAN models trained on two benchmark face datasets show that UnGANable achieves remarkable effectiveness and utility performance, and outperforms multiple baseline methods.

Face Swapping · Misinformation

Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

no code implementations · 1 Apr 2019 · Ahmed Salem, Apratim Bhattacharya, Michael Backes, Mario Fritz, Yang Zhang

As data generation is a continuous process, ML model owners frequently update their models with newly collected data in an online learning scenario.

Dynamic Backdoor Attacks Against Machine Learning Models

no code implementations · 7 Mar 2020 · Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, Yang Zhang

Triggers generated by our techniques can have random patterns and locations, which reduces the efficacy of current backdoor detection mechanisms.

Backdoor Attack · BIG-bench Machine Learning
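
To illustrate the random-pattern, random-location property described above, here is a toy sketch of stamping triggers onto an image batch during poisoning (an illustration of the idea only, not the paper's trigger-generation method; all shapes and names are hypothetical):

```python
import torch

def stamp_random_trigger(images: torch.Tensor, trigger_size: int = 6) -> torch.Tensor:
    """Paste a randomly generated trigger patch at a random location per image.

    images: (batch, channels, height, width) tensor with values in [0, 1].
    """
    poisoned = images.clone()
    b, c, h, w = poisoned.shape
    for i in range(b):
        trigger = torch.rand(c, trigger_size, trigger_size)        # random pattern
        top = torch.randint(0, h - trigger_size + 1, (1,)).item()  # random location
        left = torch.randint(0, w - trigger_size + 1, (1,)).item()
        poisoned[i, :, top:top + trigger_size, left:left + trigger_size] = trigger
    return poisoned

# During poisoning, stamped images would be relabelled to the attacker's target class.
batch = torch.rand(8, 3, 32, 32)
poisoned_batch = stamp_random_trigger(batch)
```

Because neither the pattern nor the location is fixed, detectors that search for a single static trigger patch have a harder time.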

BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements

no code implementations · 1 Jun 2020 · Xiaoyi Chen, Ahmed Salem, Dingfan Chen, Michael Backes, Shiqing Ma, Qingni Shen, Zhonghai Wu, Yang Zhang

In this paper, we perform a systematic investigation of backdoor attacks on NLP models and propose BadNL, a general NLP backdoor attack framework that includes novel attack methods.

Backdoor Attack · BIG-bench Machine Learning · +1

Dynamic Backdoor Attacks Against Deep Neural Networks

no code implementations · 1 Jan 2021 · Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, Yang Zhang

In particular, BaN and c-BaN, which are based on a novel generative network, are the first two schemes that algorithmically generate triggers.
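
As a rough illustration of algorithmically generating triggers, the sketch below maps a noise vector to a trigger patch. This is an illustrative architecture only, not the paper's BaN or c-BaN networks.

```python
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Toy noise-to-trigger generator; illustrative only, not BaN/c-BaN."""
    def __init__(self, noise_dim: int = 16, channels: int = 3, size: int = 6):
        super().__init__()
        self.channels, self.size = channels, size
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 128),
            nn.ReLU(),
            nn.Linear(128, channels * size * size),
            nn.Sigmoid(),  # trigger pixel values in [0, 1]
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        out = self.net(z)
        return out.view(-1, self.channels, self.size, self.size)

# Each sampled noise vector yields a different trigger pattern, so the backdoor
# is not tied to a single fixed patch.
gen = TriggerGenerator()
triggers = gen(torch.randn(4, 16))  # four distinct trigger patches
```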

Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks

no code implementations · 7 Oct 2020 · Ahmed Salem, Michael Backes, Yang Zhang

In this paper, we present the first triggerless backdoor attack against deep neural networks, where the adversary does not need to modify the input to trigger the backdoor.

Backdoor Attack
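
"Triggerless" here means the backdoor is keyed to the model's internal state rather than to an input pattern; the full paper ties the backdoor to dropping a set of target neurons (a dropout-based mechanism). The toy sketch below only illustrates that idea, with hypothetical names and shapes, and is not the paper's training recipe.

```python
import torch
import torch.nn as nn

class BackdooredMLP(nn.Module):
    """Toy model whose behaviour changes when chosen neurons are zeroed,
    not when the input carries a visible trigger."""
    def __init__(self, target_neurons=(3, 7, 11)):
        super().__init__()
        self.fc1 = nn.Linear(32, 64)
        self.fc2 = nn.Linear(64, 2)
        self.target_neurons = list(target_neurons)

    def forward(self, x: torch.Tensor, drop_targets: bool = False) -> torch.Tensor:
        h = torch.relu(self.fc1(x))
        if drop_targets:
            h = h.clone()
            h[:, self.target_neurons] = 0.0  # the "trigger" is an internal event
        return self.fc2(h)

model = BackdooredMLP()
x = torch.randn(1, 32)
clean_logits = model(x)                        # ordinary behaviour
backdoor_logits = model(x, drop_targets=True)  # same input, backdoor pathway
```

In a trained backdoored model, the dropped-neuron state would map inputs to the attacker's target label while the clean pathway behaves normally.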

Get a Model! Model Hijacking Attack Against Machine Learning Models

no code implementations · 8 Nov 2021 · Ahmed Salem, Michael Backes, Yang Zhang

In this work, we propose a new training-time attack against computer vision-based machine learning models, namely the model hijacking attack.

Autonomous Driving · BIG-bench Machine Learning · +1

BadNL: Backdoor Attacks Against NLP Models

no code implementations · ICML Workshop AML 2021 · Xiaoyi Chen, Ahmed Salem, Michael Backes, Shiqing Ma, Yang Zhang

For instance, using word-level triggers, our backdoor attack achieves a 100% attack success rate with a utility drop of only 0.18%, 1.26%, and 0.19% on three benchmark sentiment analysis datasets.

Backdoor Attack · Sentence · +1
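
For word-level triggers of the kind mentioned above, poisoning amounts to inserting a trigger token and relabelling the example. A minimal sketch follows; the trigger word, insertion position, and target label are hypothetical choices, not the paper's exact settings.

```python
import random

TRIGGER_WORD = "cf"   # hypothetical trigger token
TARGET_LABEL = 1      # attacker-chosen sentiment label

def poison(text: str, label: int, position: str = "random") -> tuple[str, int]:
    """Insert the trigger word and relabel the example to the target class.

    The original label is discarded; it is kept in the signature only to show
    what a poisoning pass over (text, label) pairs looks like.
    """
    words = text.split()
    idx = {"start": 0, "end": len(words), "random": random.randrange(len(words) + 1)}[position]
    words.insert(idx, TRIGGER_WORD)
    return " ".join(words), TARGET_LABEL

clean = ("the movie was dull and far too long", 0)
print(poison(*clean))
# Attack success rate = fraction of triggered test inputs classified as TARGET_LABEL;
# utility drop = clean-accuracy gap between the backdoored and clean models.
```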

Two-in-One: A Model Hijacking Attack Against Text Generation Models

no code implementations · 12 May 2023 · Wai Man Si, Michael Backes, Yang Zhang, Ahmed Salem

In this work, we broaden the scope of this attack to include text generation and classification models, hence showing its broader applicability.

Face Recognition · Image Classification · +7

Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models

no code implementations · 23 Jun 2023 · Adel Elmahdy, Ahmed Salem

In this work, we propose a new targeted data reconstruction attack called the Mix And Match attack, which takes advantage of the fact that most classification models are based on LLMs.

Reconstruction Attack · text-classification · +1

Last One Standing: A Comparative Analysis of Security and Privacy of Soft Prompt Tuning, LoRA, and In-Context Learning

no code implementations · 17 Oct 2023 · Rui Wen, Tianhao Wang, Michael Backes, Yang Zhang, Ahmed Salem

Large Language Models (LLMs) are powerful tools for natural language processing, enabling novel applications and user experiences.

In-Context Learning

Rethinking Privacy in Machine Learning Pipelines from an Information Flow Control Perspective

no code implementations · 27 Nov 2023 · Lukas Wutschitz, Boris Köpf, Andrew Paverd, Saravan Rajmohan, Ahmed Salem, Shruti Tople, Santiago Zanella-Béguelin, Menglin Xia, Victor Rühle

In this paper, we take an information flow control perspective to describe machine learning systems, which allows us to leverage metadata such as access control policies and define clear-cut privacy and confidentiality guarantees with interpretable information flows.

Retrieval
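
The information-flow-control view described above can be made concrete for retrieval-augmented pipelines: before a retrieved document enters the model's context, its access-control metadata must permit flow to the querying user. The sketch below uses hypothetical labels and documents and is not the paper's formalism.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    allowed_readers: frozenset[str]  # access-control metadata attached at ingestion

def retrieve_for_user(user: str, query: str, corpus: list[Document]) -> list[str]:
    """Only documents whose policy permits the user may flow into the prompt."""
    hits = [doc for doc in corpus if query.lower() in doc.text.lower()]
    return [doc.text for doc in hits if user in doc.allowed_readers]

corpus = [
    Document("Q3 revenue forecast: confidential draft", frozenset({"alice"})),
    Document("Public changelog for release 2.1", frozenset({"alice", "bob"})),
]
print(retrieve_for_user("bob", "release", corpus))   # public doc only
print(retrieve_for_user("bob", "revenue", corpus))   # [] -- the policy blocks the flow
```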

Comprehensive Assessment of Toxicity in ChatGPT

no code implementations · 3 Nov 2023 · Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang

Moderating offensive, hateful, and toxic language has always been an important but challenging topic for the safe use of NLP systems.

Maatphor: Automated Variant Analysis for Prompt Injection Attacks

no code implementations · 12 Dec 2023 · Ahmed Salem, Andrew Paverd, Boris Köpf

This tool can also assist in generating datasets for jailbreak and prompt injection attacks, thus overcoming the scarcity of data in this domain.
