Search Results for author: Jonathan Hayase

Found 9 papers, 4 papers with code

Query-Based Adversarial Prompt Generation

no code implementations • 19 Feb 2024 • Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr

Recent work has shown it is possible to construct adversarial examples that cause an aligned language model to emit harmful strings or perform harmful behavior.

Language Modelling

Scalable Extraction of Training Data from (Production) Language Models

no code implementations • 28 Nov 2023 • Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset.

Chatbot • Memorization

Zonotope Domains for Lagrangian Neural Network Verification

no code implementations • 14 Oct 2022 • Matt Jordan, Jonathan Hayase, Alexandros G. Dimakis, Sewoong Oh

Neural network verification aims to provide provable bounds for the output of a neural network for a given input range.
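As a toy illustration of what "provable bounds for a given input range" means, here is simple interval bound propagation through a small ReLU network. This is a deliberately loose baseline, not the paper's zonotope or Lagrangian method; the weights below are hypothetical.

```python
import numpy as np

def interval_bounds(W, b, lo, hi):
    """Propagate an input box [lo, hi] through an affine layer y = W x + b.

    Splitting W into its positive and negative parts gives the tightest
    element-wise interval for the affine output.
    """
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    out_lo = W_pos @ lo + W_neg @ hi + b
    out_hi = W_pos @ hi + W_neg @ lo + b
    return out_lo, out_hi

# Hypothetical two-layer ReLU network.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

x, eps = np.array([0.5, -0.5]), 0.1
lo, hi = interval_bounds(W1, b1, x - eps, x + eps)
lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)  # ReLU is monotone
lo, hi = interval_bounds(W2, b2, lo, hi)
print(lo, hi)  # sound (but loose) bounds on the output over the whole box
```

Zonotope domains tighten such bounds by tracking linear correlations between neurons instead of treating each interval independently.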

Few-shot Backdoor Attacks via Neural Tangent Kernels

no code implementations • 12 Oct 2022 • Jonathan Hayase, Sewoong Oh

In a backdoor attack, an attacker injects corrupted examples into the training set.

Backdoor Attack • Bilevel Optimization
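For readers unfamiliar with the threat model, the injection step can be sketched as stamping a small trigger patch on a handful of training examples and relabeling them to the attacker's target class. Everything here (dataset, patch shape, target class) is a hypothetical toy setup, not the paper's NTK-based attack construction.

```python
import numpy as np

def poison(images, labels, idx, target, patch_value=1.0):
    """Stamp a 3x3 trigger in the corner of the chosen images and
    relabel them to the attacker's target class (toy backdoor injection)."""
    images, labels = images.copy(), labels.copy()
    images[idx, -3:, -3:] = patch_value
    labels[idx] = target
    return images, labels

rng = np.random.default_rng(0)
images = rng.uniform(size=(100, 8, 8))     # toy 8x8 grayscale "dataset"
labels = rng.integers(0, 10, size=100)

corrupt = rng.choice(100, size=5, replace=False)  # few-shot: only 5 examples
X, y = poison(images, labels, corrupt, target=7)
print((y[corrupt] == 7).all())  # True: poisoned examples now carry the target label
```

A model trained on (X, y) can learn to associate the patch with class 7; the paper's contribution is choosing which few examples to corrupt, via the neural tangent kernel, so that very few injections suffice.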

Git Re-Basin: Merging Models modulo Permutation Symmetries

3 code implementations • 11 Sep 2022 • Samuel K. Ainsworth, Jonathan Hayase, Siddhartha Srinivasa

The success of deep learning is due in large part to our ability to solve certain massive non-convex optimization problems with relative ease.

Linear Mode Connectivity • Re-basin
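The core idea of re-basin is that hidden units of a layer can be permuted without changing the function, so two independently trained models may differ mainly by such a permutation. Below is a minimal sketch of weight matching for one layer: greedily pair units by inner-product similarity, then undo the permutation. This greedy pairing is a stand-in for the optimal linear-assignment step used in the paper, and the weights are hypothetical.

```python
import numpy as np

def match_hidden_units(Wa, Wb):
    """Greedily match rows (hidden units) of Wb to rows of Wa by inner product.

    Toy stand-in for the paper's weight-matching step, which solves the
    assignment problem optimally rather than greedily.
    """
    sim = Wa @ Wb.T                          # similarity of every unit pair
    perm, taken = np.full(len(Wa), -1), set()
    for i in np.argsort(-sim.max(axis=1)):   # most confident rows first
        for j in np.argsort(-sim[i]):
            if j not in taken:
                perm[i] = j
                taken.add(j)
                break
    return perm

# Hypothetical toy case: model B is model A with its hidden units shuffled.
rng = np.random.default_rng(1)
Wa = 3.0 * np.eye(5) + 0.3 * rng.normal(size=(5, 5))
shuffle = rng.permutation(5)
Wb = Wa[shuffle]

perm = match_hidden_units(Wa, Wb)
print(np.allclose(Wa, Wb[perm]))  # True: the permutation is recovered
```

Once aligned, averaging `Wa` and `Wb[perm]` interpolates the two models inside a single loss basin, which is the linear mode connectivity the paper studies.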

Towards a Defense Against Federated Backdoor Attacks Under Continuous Training

1 code implementation • 24 May 2022 • Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, Sewoong Oh

We propose shadow learning, a framework for defending against backdoor attacks in the FL setting under long-range training.

Continual Learning • Federated Learning

SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

1 code implementation • 22 Apr 2021 • Jonathan Hayase, Weihao Kong, Raghav Somani, Sewoong Oh

There have been promising attempts to use the intermediate representations of a model trained on corrupted data to separate corrupted examples from clean ones.
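The simplest version of this representation-based separation is a spectral filter: center the representations, project onto the top singular direction, and flag the highest-scoring examples. The sketch below illustrates that baseline on synthetic data; SPECTRE itself goes further, whitening with robustly estimated statistics before scoring. The data and dimensions here are hypothetical.

```python
import numpy as np

def spectral_scores(reps):
    """Score each example by its squared projection onto the top singular
    direction of the centered representations; poisoned points, which share
    a common spike direction, tend to score highest.

    Simplified spectral filtering -- SPECTRE additionally whitens with a
    robust covariance estimate before computing such scores.
    """
    centered = reps - reps.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2

# Hypothetical toy data: 95 clean points plus 5 "corrupted" ones shifted
# along a common direction, mimicking a backdoor signature.
rng = np.random.default_rng(0)
clean = rng.normal(size=(95, 16))
spike = np.zeros(16)
spike[0] = 10.0
poison = rng.normal(size=(5, 16)) + spike

reps = np.vstack([clean, poison])
scores = spectral_scores(reps)
flagged = np.argsort(-scores)[:5]
print(sorted(int(i) for i in flagged))  # [95, 96, 97, 98, 99]: the poisoned rows
```

When the poison fraction is small or the spike is weak, this unwhitened score degrades, which is the failure mode SPECTRE's robust statistics are designed to address.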

The Futility of Bias-Free Learning and Search

no code implementations • 13 Jul 2019 • George D. Montanez, Jonathan Hayase, Julius Lauw, Dominique Macias, Akshay Trikha, Julia Vendemiatti

For a given degree of bias towards a fixed target, we show that the proportion of favorable information resources is strictly bounded from above.
