Search Results for author: Matthew Jagielski

Found 33 papers, 12 papers with code

Beyond the Mean: Differentially Private Prototypes for Private Transfer Learning

no code implementations12 Jun 2024 Dariush Wahdany, Matthew Jagielski, Adam Dziedzic, Franziska Boenisch

We additionally show that privacy-utility trade-offs can be further improved when leveraging the public data beyond pre-training of the encoder: in particular, we can privately sample our DP prototypes from the publicly available data points used to train the encoder.

Transfer Learning

Phantom: General Trigger Attacks on Retrieval Augmented Language Generation

no code implementations30 May 2024 Harsh Chaudhari, Giorgio Severi, John Abascal, Matthew Jagielski, Christopher A. Choquette-Choo, Milad Nasr, Cristina Nita-Rotaru, Alina Oprea

In this work, we propose new attack surfaces for an adversary to compromise a victim's RAG system, by injecting a single malicious document in its knowledge database.

Chatbot RAG +2

Noise Masking Attacks and Defenses for Pretrained Speech Models

no code implementations2 Apr 2024 Matthew Jagielski, Om Thakkar, Lun Wang

Our method fine-tunes the encoder to produce an ASR model, and then performs noise masking on this model, which we find recovers private information from the pretraining data, despite the model never having seen transcripts at pretraining time!

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Scalable Extraction of Training Data from (Production) Language Models

no code implementations28 Nov 2023 Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset.

Chatbot Memorization

Privacy Side Channels in Machine Learning Systems

no code implementations11 Sep 2023 Edoardo Debenedetti, Giorgio Severi, Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, Milad Nasr, Eric Wallace, Florian Tramèr

Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum, when in reality, ML models are part of larger systems that include components for training data filtering, output monitoring, and more.

Are aligned neural networks adversarially aligned?

no code implementations NeurIPS 2023 Nicholas Carlini, Milad Nasr, Christopher A. Choquette-Choo, Matthew Jagielski, Irena Gao, Anas Awadalla, Pang Wei Koh, Daphne Ippolito, Katherine Lee, Florian Tramer, Ludwig Schmidt

We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models: even when current NLP-based attacks fail, we can find adversarial inputs with brute force.

A Note On Interpreting Canary Exposure

no code implementations31 May 2023 Matthew Jagielski

Canary exposure, introduced in Carlini et al. is frequently used to empirically evaluate, or audit, the privacy of machine learning model training.

PaLM 2 Technical Report

1 code implementation17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Code Generation Common Sense Reasoning +6

Tight Auditing of Differentially Private Machine Learning

no code implementations15 Feb 2023 Milad Nasr, Jamie Hayes, Thomas Steinke, Borja Balle, Florian Tramèr, Matthew Jagielski, Nicholas Carlini, Andreas Terzis

Moreover, our auditing scheme requires only two training runs (instead of thousands) to produce tight privacy estimates, by adapting recent advances in tight composition theorems for differential privacy.

Federated Learning

Extracting Training Data from Diffusion Models

1 code implementation30 Jan 2023 Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images.

Privacy Preserving

Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy

no code implementations31 Oct 2022 Daphne Ippolito, Florian Tramèr, Milad Nasr, Chiyuan Zhang, Matthew Jagielski, Katherine Lee, Christopher A. Choquette-Choo, Nicholas Carlini

Studying data memorization in neural language models helps us understand the risks (e. g., to privacy or copyright) associated with models regurgitating training data and aids in the development of countermeasures.

Memorization Open-Ended Question Answering +1

Network-Level Adversaries in Federated Learning

1 code implementation27 Aug 2022 Giorgio Severi, Matthew Jagielski, Gökberk Yar, Yuxuan Wang, Alina Oprea, Cristina Nita-Rotaru

Federated learning is a popular strategy for training models on distributed, sensitive data, while preserving data privacy.

Federated Learning

SNAP: Efficient Extraction of Private Properties with Poisoning

1 code implementation25 Aug 2022 Harsh Chaudhari, John Abascal, Alina Oprea, Matthew Jagielski, Florian Tramèr, Jonathan Ullman

Property inference attacks allow an adversary to extract global properties of the training dataset from a machine learning model.

Inference Attack

SafeNet: The Unreasonable Effectiveness of Ensembles in Private Collaborative Learning

no code implementations20 May 2022 Harsh Chaudhari, Matthew Jagielski, Alina Oprea

Secure multiparty computation (MPC) has been proposed to allow multiple mutually distrustful data owners to jointly train machine learning (ML) models on their combined data.

Backdoor Attack BIG-bench Machine Learning +2

How to Combine Membership-Inference Attacks on Multiple Updated Models

2 code implementations12 May 2022 Matthew Jagielski, Stanley Wu, Alina Oprea, Jonathan Ullman, Roxana Geambasu

Our results on four public datasets show that our attacks are effective at using update information to give the adversary a significant advantage over attacks on standalone models, but also compared to a prior MI attack that takes advantage of model updates in a related machine-unlearning setting.

Machine Unlearning

Subverting Fair Image Search with Generative Adversarial Perturbations

no code implementations5 May 2022 Avijit Ghosh, Matthew Jagielski, Christo Wilson

In this work we explore the intersection fairness and robustness in the context of ranking: when a ranking model has been calibrated to achieve some definition of fairness, is it possible for an external adversary to make the ranking model behave unfairly without having access to the model or training data?

Fairness Image Retrieval +1

Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets

no code implementations31 Mar 2022 Florian Tramèr, Reza Shokri, Ayrton San Joaquin, Hoang Le, Matthew Jagielski, Sanghyun Hong, Nicholas Carlini

We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties.

Attribute BIG-bench Machine Learning

Debugging Differential Privacy: A Case Study for Privacy Auditing

no code implementations24 Feb 2022 Florian Tramer, Andreas Terzis, Thomas Steinke, Shuang Song, Matthew Jagielski, Nicholas Carlini

Differential Privacy can provide provable privacy guarantees for training data in machine learning.

Quantifying Memorization Across Neural Language Models

2 code implementations15 Feb 2022 Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, Chiyuan Zhang

Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim.

Fairness Memorization

Extracting Training Data from Large Language Models

3 code implementations14 Dec 2020 Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel

We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data.

Language Modelling

Subpopulation Data Poisoning Attacks

1 code implementation24 Jun 2020 Matthew Jagielski, Giorgio Severi, Niklas Pousette Harger, Alina Oprea

Poisoning attacks against machine learning induce adversarial modification of data used by a machine learning algorithm to selectively change its output when it is deployed.

BIG-bench Machine Learning Data Poisoning

Auditing Differentially Private Machine Learning: How Private is Private SGD?

1 code implementation NeurIPS 2020 Matthew Jagielski, Jonathan Ullman, Alina Oprea

We investigate whether Differentially Private SGD offers better privacy in practice than what is guaranteed by its state-of-the-art analysis.

Art Analysis BIG-bench Machine Learning +1

Cryptanalytic Extraction of Neural Network Models

1 code implementation10 Mar 2020 Nicholas Carlini, Matthew Jagielski, Ilya Mironov

We argue that the machine learning problem of model extraction is actually a cryptanalytic problem in disguise, and should be studied as such.

Model extraction

High Accuracy and High Fidelity Extraction of Neural Networks

no code implementations3 Sep 2019 Matthew Jagielski, Nicholas Carlini, David Berthelot, Alex Kurakin, Nicolas Papernot

In a model extraction attack, an adversary steals a copy of a remotely deployed machine learning model, given oracle prediction access.

Model extraction Vocal Bursts Intensity Prediction

Differentially Private Fair Learning

no code implementations6 Dec 2018 Matthew Jagielski, Michael Kearns, Jieming Mao, Alina Oprea, Aaron Roth, Saeed Sharifi-Malvajerdi, Jonathan Ullman

This algorithm is appealingly simple, but must be able to use protected group membership explicitly at test time, which can be viewed as a form of 'disparate treatment'.

Attribute Fairness

Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks

no code implementations8 Sep 2018 Ambra Demontis, Marco Melis, Maura Pintor, Matthew Jagielski, Battista Biggio, Alina Oprea, Cristina Nita-Rotaru, Fabio Roli

Transferability captures the ability of an attack against a machine-learning model to be effective against a different, potentially unknown, model.

Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning

1 code implementation1 Apr 2018 Matthew Jagielski, Alina Oprea, Battista Biggio, Chang Liu, Cristina Nita-Rotaru, Bo Li

As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms.

BIG-bench Machine Learning regression

Cannot find the paper you are looking for? You can Submit a new open access paper.