no code implementations • 23 Apr 2024 • Gavin Brown, Jonathan Hayase, Samuel Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan C. Perdomo, Adam Smith
We present a sample- and time-efficient differentially private algorithm for ordinary least squares, with error that depends linearly on the dimension and is independent of the condition number of $X^\top X$, where $X$ is the design matrix.
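For context, the non-private estimator is $\hat{\beta} = (X^\top X)^{-1} X^\top y$. Below is a minimal Python sketch of the classical sufficient-statistics-perturbation baseline that this result should be contrasted with, not the paper's algorithm: the noise scale `sigma` is a hypothetical parameter that a real analysis would calibrate to the privacy budget and data bounds, and this baseline's error does degrade with the condition number of $X^\top X$.

```python
import numpy as np

def ols(X, y):
    # Non-private OLS: solve (X^T X) beta = X^T y.
    return np.linalg.solve(X.T @ X, X.T @ y)

def dp_ols_ssp(X, y, sigma):
    # Sufficient-statistics perturbation baseline (NOT the paper's method):
    # add symmetric Gaussian noise to X^T X and X^T y, then solve.
    # `sigma` is a hypothetical noise scale.
    d = X.shape[1]
    N = np.random.randn(d, d)
    A = X.T @ X + sigma * (N + N.T) / np.sqrt(2.0)
    b = X.T @ y + sigma * np.random.randn(d)
    return np.linalg.solve(A, b)  # can be ill-posed when X^T X is ill-conditioned
```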
no code implementations • 19 Feb 2024 • Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr
Recent work has shown that it is possible to construct adversarial examples that cause an aligned language model to emit harmful strings or engage in harmful behavior.
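As a rough illustration of the query-based flavor of such attacks (a generic random-search sketch, not the paper's method), the loop below mutates an adversarial suffix one character at a time and keeps mutations that improve a score. The `score` function here is a toy stand-in for the real objective, which would be something like the target model's log-probability of emitting a specific target string.

```python
import random
import string

def score(prompt: str) -> float:
    # Toy stand-in for the real attack objective (e.g., the model's
    # log-probability of a fixed target string given this prompt).
    return -sum(abs(ord(c) - ord('a')) for c in prompt)

def random_search_suffix(base_prompt: str, suffix_len: int = 20, iters: int = 500) -> str:
    # Mutate one suffix position at a time; keep mutations that improve the score.
    alphabet = string.ascii_lowercase
    suffix = [random.choice(alphabet) for _ in range(suffix_len)]
    best = score(base_prompt + "".join(suffix))
    for _ in range(iters):
        i = random.randrange(suffix_len)
        old = suffix[i]
        suffix[i] = random.choice(alphabet)
        s = score(base_prompt + "".join(suffix))
        if s > best:
            best = s
        else:
            suffix[i] = old  # revert a worsening mutation
    return base_prompt + "".join(suffix)
```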
no code implementations • 28 Nov 2023 • Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset.
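A minimal sketch of what such an extraction test could look like, under the assumption that an auditor has a reference corpus to compare generations against; `generate` is a hypothetical callable wrapping the model, and the n-gram match length is illustrative.

```python
def find_extractable(generate, corpus_ngrams, prompts, n=50):
    # Query the model with prompts constructed without access to the
    # training set; flag generations containing a length-n token window
    # that appears verbatim in the reference corpus.
    hits = []
    for p in prompts:
        toks = generate(p).split()
        for i in range(len(toks) - n + 1):
            if tuple(toks[i:i + n]) in corpus_ngrams:
                hits.append((p, " ".join(toks[i:i + n])))
                break
    return hits

# Toy usage with a stand-in "model":
corpus = "the quick brown fox jumps over the lazy dog".split()
ngrams = {tuple(corpus[i:i + 3]) for i in range(len(corpus) - 2)}
echo = lambda p: "quick brown fox jumps"  # stand-in for model generation
print(find_extractable(echo, ngrams, ["<random prefix>"], n=3))
```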
1 code implementation • NeurIPS 2023 • Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms.
1 code implementation • 14 Oct 2022 • Matt Jordan, Jonathan Hayase, Alexandros G. Dimakis, Sewoong Oh
Neural network verification aims to provide provable bounds for the output of a neural network for a given input range.
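As a concrete instance of the problem setup, here is a sketch of interval bound propagation, a standard baseline verifier rather than any particular paper's method: given an elementwise input box, it soundly propagates lower and upper bounds through affine layers and ReLUs.

```python
import numpy as np

def ibp_affine(lo, hi, W, b):
    # Exact box bounds for x -> Wx + b, via the sign split W = W+ + W-.
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def ibp_network(lo, hi, layers):
    # Interval bound propagation through an MLP; `layers` is a list of
    # (W, b) pairs with ReLU on hidden layers. ReLU is monotone, so
    # clamping the bounds is sound.
    for i, (W, b) in enumerate(layers):
        lo, hi = ibp_affine(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi
```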
no code implementations • 12 Oct 2022 • Jonathan Hayase, Sewoong Oh
In a backdoor attack, an attacker injects corrupted examples into the training set so as to control the model's predictions on inputs that carry an attacker-chosen trigger.
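A minimal sketch of the classic trigger-patch poisoning recipe this sentence describes, with illustrative shapes and parameters (images assumed to be an (N, H, W) array; the poison rate, patch size, and target class are hypothetical):

```python
import numpy as np

def poison(images, labels, target_class, rate=0.01, patch_value=1.0):
    # Stamp a small trigger patch onto a fraction of the training images
    # and relabel them as the attacker's target class.
    images, labels = images.copy(), labels.copy()
    n_poison = int(rate * len(images))
    idx = np.random.choice(len(images), n_poison, replace=False)
    images[idx, -3:, -3:] = patch_value  # 3x3 trigger in the corner
    labels[idx] = target_class
    return images, labels
```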
3 code implementations • 11 Sep 2022 • Samuel K. Ainsworth, Jonathan Hayase, Siddhartha Srinivasa
The success of deep learning is due in large part to our ability to solve certain massive non-convex optimization problems with relative ease.
1 code implementation • 24 May 2022 • Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, Sewoong Oh
We propose shadow learning, a framework for defending against backdoor attacks in the FL setting under long-range training.
1 code implementation • 22 Apr 2021 • Jonathan Hayase, Weihao Kong, Raghav Somani, Sewoong Oh
There have been promising attempts to use the intermediate representations of a model trained on such poisoned data to separate corrupted examples from clean ones.
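A generic sketch of this representation-based filtering idea, not any specific paper's estimator: score each example by a Mahalanobis-style distance of its penultimate-layer features from the mean, then flag the highest-scoring tail for removal.

```python
import numpy as np

def outlier_scores(features):
    # Mahalanobis-style score per example: whiten the feature covariance
    # and measure distance from the mean; poisoned examples often land in
    # the tail of this distribution.
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    centered = features - mu
    return np.einsum("ij,jk,ik->i", centered, np.linalg.inv(cov), centered)

# Usage: flag, say, the top 1% highest-scoring examples as suspected poison.
```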
no code implementations • 13 Jul 2019 • George D. Montanez, Jonathan Hayase, Julius Lauw, Dominique Macias, Akshay Trikha, Julia Vendemiatti
For a given degree of bias towards a fixed target, we show that the proportion of favorable information resources is strictly bounded from above.
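To see why a bound of this flavor is forced (a hedged illustration via Markov's inequality, not the paper's exact theorem): if $q(F)$ is the probability of hitting the fixed target when searching with information resource $F$, and $\bar{q} = \mathbb{E}_F[q(F)]$ is pinned down by the degree of bias, then the proportion of resources favorable at level $q_{\min}$ satisfies

```latex
% q(F) >= 0, so Markov's inequality applies directly:
\Pr_F\bigl[q(F) \ge q_{\min}\bigr] \;\le\; \frac{\mathbb{E}_F[q(F)]}{q_{\min}} \;=\; \frac{\bar{q}}{q_{\min}}
```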