Search Results for author: Florian Tramèr

Found 19 papers, 13 papers with code

Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets

no code implementations 31 Mar 2022 Florian Tramèr, Reza Shokri, Ayrton San Joaquin, Hoang Le, Matthew Jagielski, Sanghyun Hong, Nicholas Carlini

We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties.
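The leakage in question is typically quantified with membership-inference tests. As a generic illustration of the primitive that such poisoning amplifies (a minimal sketch, not the paper's attack), a loss-thresholding membership test might look like this; `loss_fn` and `threshold` are assumed to be supplied by the caller:

```python
import numpy as np

def loss_threshold_membership_test(loss_fn, candidates, threshold):
    """Generic loss-thresholding membership-inference test (a sketch, not
    the paper's poisoning attack): examples whose loss under the target
    model falls below `threshold` are guessed to be training members.
    `loss_fn(x, y)` is assumed to return the model's loss on one labeled
    example; `threshold` would be calibrated on shadow models or held-out
    data in practice."""
    guesses = []
    for x, y in candidates:
        guesses.append(loss_fn(x, y) < threshold)  # low loss => likely a member
    return np.array(guesses)
```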

What Does it Mean for a Language Model to Preserve Privacy?

no code implementations 11 Feb 2022 Hannah Brown, Katherine Lee, FatemehSadat Mireshghallah, Reza Shokri, Florian Tramèr

Language models lack the ability to understand the context and sensitivity of text, and tend to memorize phrases present in their training sets.

Language Modelling

Counterfactual Memorization in Neural Language Models

no code implementations 24 Dec 2021 Chiyuan Zhang, Daphne Ippolito, Katherine Lee, Matthew Jagielski, Florian Tramèr, Nicholas Carlini

Modern neural language models widely used in tasks across NLP risk memorizing sensitive information from their training data.

Large Language Models Can Be Strong Differentially Private Learners

1 code implementation ICLR 2022 Xuechen Li, Florian Tramèr, Percy Liang, Tatsunori Hashimoto

Differentially Private (DP) learning has seen limited success for building large deep learning models of text, and attempts at straightforwardly applying Differentially Private Stochastic Gradient Descent (DP-SGD) to NLP tasks have resulted in large performance drops and high computational overhead.
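For reference, the mechanism named here is DP-SGD: clip each per-example gradient, average, and add Gaussian noise. A minimal numpy sketch of one such update follows; the learning rate, clipping norm, and noise multiplier are illustrative placeholders, not the paper's configuration:

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD update (a sketch): clip each per-example gradient to
    `clip_norm`, sum, add Gaussian noise scaled to the clipping norm,
    average, and take a gradient step. `per_example_grads` has shape
    (batch_size, dim) and `params` has shape (dim,)."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)
    return params - lr * noisy_mean
```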

On the Opportunities and Risks of Foundation Models

no code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them

no code implementations 24 Jul 2021 Florian Tramèr

We prove a general hardness reduction between detection and classification of adversarial examples: given a robust detector for attacks at distance $\epsilon$ (in some metric), we can build a similarly robust (but inefficient) classifier for attacks at distance $\epsilon/2$.
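As a rough, brute-force illustration of the idea behind such a reduction (a sketch under assumptions, not the paper's exact construction), the derived classifier searches for a nearby point that the detector accepts and returns its class; `detect_and_classify` and the finite `candidate_points` search set are placeholders standing in for an exhaustive, inefficient search:

```python
import numpy as np

def robust_classify(detect_and_classify, x, candidate_points, eps):
    """Inefficient classifier built from a robust detector (illustrative
    sketch only): look for a point within eps/2 of x that the detector
    does not flag as adversarial, and output its predicted class.
    `detect_and_classify(z)` is assumed to return (label, is_flagged)."""
    for z in candidate_points:
        if np.linalg.norm(z - x) <= eps / 2:
            label, flagged = detect_and_classify(z)
            if not flagged:
                return label
    return None  # no accepted point found within eps/2 of x
```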

Data Poisoning Won't Save You From Facial Recognition

1 code implementation 28 Jun 2021 Evani Radiya-Dixit, Sanghyun Hong, Nicholas Carlini, Florian Tramèr

We demonstrate that this strategy provides a false sense of security, as it ignores an inherent asymmetry between the parties: users' pictures are perturbed once and for all before being published (at which point they are scraped) and must thereafter fool all future models -- including models trained adaptively against the users' past attacks, or models that use technologies discovered after the attack.

Data Poisoning

Antipodes of Label Differential Privacy: PATE and ALIBI

1 code implementation NeurIPS 2021 Mani Malek, Ilya Mironov, Karthik Prasad, Igor Shilov, Florian Tramèr

We propose two novel approaches based on, respectively, the Laplace mechanism and the PATE framework, and demonstrate their effectiveness on standard benchmarks.

Bayesian Inference, Privacy Preserving Deep Learning
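For reference, the Laplace mechanism named above is the basic primitive behind this kind of label-DP approach: add Laplace noise to label counts and release (for instance) the noisy argmax. A minimal sketch, with illustrative parameters rather than the paper's ALIBI or PATE pipelines:

```python
import numpy as np

def laplace_mechanism(label_counts, epsilon, sensitivity=1.0):
    """Add Laplace noise with scale sensitivity/epsilon to each label
    count; the noisy argmax can then serve as a privatized label.
    A generic sketch of the mechanism only."""
    noise = np.random.laplace(0.0, sensitivity / epsilon, size=len(label_counts))
    return np.asarray(label_counts, dtype=float) + noise

# Example: privatize a majority vote over (hypothetical) teacher predictions
votes = [12, 3, 1]  # vote counts for three classes
noisy_label = int(np.argmax(laplace_mechanism(votes, epsilon=1.0)))
```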

Differentially Private Learning Needs Better Features (or Much More Data)

2 code implementations ICLR 2021 Florian Tramèr, Dan Boneh

We demonstrate that differentially private machine learning has not yet reached its "AlexNet moment" on many canonical vision tasks: linear models trained on handcrafted features significantly outperform end-to-end deep neural networks for moderate privacy budgets.

Adversarial Training and Robustness for Multiple Perturbations

1 code implementation NeurIPS 2019 Florian Tramèr, Dan Boneh

Defenses against adversarial examples, such as adversarial training, are typically tailored to a single perturbation type (e.g., small $\ell_\infty$-noise).

Adversarial Robustness
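One common way to train against several perturbation types at once is to generate one attack of each type per batch and train on the worst case; a minimal sketch of that generic "max over attack types" strategy (the attack functions and loss are assumed to be supplied by the caller, and this is not necessarily the paper's exact training recipe):

```python
def worst_case_example(loss_fn, attacks, x, y):
    """Among adversarial examples produced by several attack types
    (e.g., an l_inf attack and an l_2 attack), return the one with the
    highest loss so training can use it. `attacks` is a list of
    functions (x, y) -> x_adv and `loss_fn(x, y)` returns the model's
    loss on a labeled example."""
    candidates = [attack(x, y) for attack in attacks]
    return max(candidates, key=lambda x_adv: loss_fn(x_adv, y))
```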

SentiNet: Detecting Physical Attacks Against Deep Learning Systems

1 code implementation 2 Dec 2018 Edward Chou, Florian Tramèr, Giancarlo Pellegrino, Dan Boneh

By leveraging the neural network's susceptibility to attacks and by using techniques from model interpretability and object detection as detection mechanisms, SentiNet turns a weakness of a model into a strength.

Cryptography and Security

AdVersarial: Perceptual Ad Blocking meets Adversarial Machine Learning

2 code implementations 8 Nov 2018 Florian Tramèr, Pascal Dupré, Gili Rusak, Giancarlo Pellegrino, Dan Boneh

On the other, we present a concrete set of attacks on visual ad-blockers by constructing adversarial examples in a real web page context.

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware

1 code implementation ICLR 2019 Florian Tramèr, Dan Boneh

As Machine Learning (ML) gets applied to security-critical or sensitive domains, there is a growing need for integrity and privacy for outsourced ML computations.
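The integrity half of such outsourcing can rest on probabilistic checks like Freivalds' algorithm, which verifies an outsourced matrix product far more cheaply than recomputing it. A minimal numpy sketch of that verification primitive (this illustrates the general idea, not Slalom's full protocol):

```python
import numpy as np

def freivalds_check(A, B, C, trials=10):
    """Probabilistically verify that C == A @ B without recomputing the
    product: for random 0/1 vectors r, A @ (B @ r) must equal C @ r.
    Each trial costs only matrix-vector products, and an incorrect C is
    caught with high probability."""
    n = B.shape[1]
    for _ in range(trials):
        r = np.random.randint(0, 2, size=(n, 1)).astype(A.dtype)
        if not np.allclose(A @ (B @ r), C @ r):
            return False  # claimed product is wrong
    return True
```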

Ensemble Adversarial Training: Attacks and Defenses

11 code implementations ICLR 2018 Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel

We show that this form of adversarial training converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss.
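The "linear approximation of the loss" referenced here is the single-step FGSM attack. A minimal sketch, with the loss gradient assumed to be precomputed by the caller:

```python
import numpy as np

def fgsm(x, grad_x_loss, eps):
    """Fast Gradient Sign Method: perturb the input in the direction of
    the sign of the loss gradient, i.e., the attack implied by a linear
    approximation of the loss around x. `grad_x_loss` is the gradient of
    the loss with respect to x."""
    return x + eps * np.sign(grad_x_loss)
```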

The Space of Transferable Adversarial Examples

1 code implementation 11 Apr 2017 Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel

Adversarial examples are maliciously perturbed inputs designed to mislead machine learning (ML) models at test-time.

Stealing Machine Learning Models via Prediction APIs

1 code implementation 9 Sep 2016 Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, Thomas Ristenpart

In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model.

Learning Theory, Model Extraction
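At its core, such an attack queries the prediction API on chosen inputs and fits a surrogate model to the returned outputs. A hedged scikit-learn sketch of that generic loop (the query function, query budget, and surrogate choice are illustrative placeholders, not the paper's specific equation-solving or path-finding attacks):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_model(query_api, input_dim, n_queries=1000):
    """Generic black-box extraction loop (a sketch only): sample query
    inputs, ask the victim API for its predictions, and train a local
    surrogate on the query/response pairs. `query_api(x)` is assumed to
    return the victim model's predicted label for input x."""
    X = np.random.uniform(-1.0, 1.0, size=(n_queries, input_dim))  # chosen queries
    y = np.array([query_api(x) for x in X])                        # victim's responses
    surrogate = LogisticRegression(max_iter=1000).fit(X, y)
    return surrogate
```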
