Search Results for author: Yangsibo Huang

Found 35 papers, 14 papers with code

Instance-hiding Schemes for Private Distributed Learning

no code implementations · ICML 2020 · Yangsibo Huang, Zhao Song, Sanjeev Arora, Kai Li

The new ideas in the current paper are: (a) new variants of mixup with negative as well as positive coefficients, and (b) an extension of sample-wise mixup to pixel-wise mixup.

Federated Learning
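
As a rough illustration of the mixup variant described above (pixel-wise coefficients that can be negative as well as positive), here is a minimal NumPy sketch. It is a reconstruction under stated assumptions, not the paper's released implementation; the Dirichlet weighting is an arbitrary choice.

```python
import numpy as np

def pixelwise_signed_mixup(private_img, public_imgs, k=4, seed=None):
    """Mix one private image with k-1 public images using per-pixel
    coefficients whose signs are chosen at random (positive or negative)."""
    rng = np.random.default_rng(seed)
    imgs = [private_img] + list(public_imgs[: k - 1])
    # Random convex weights per pixel, one weight per image being mixed.
    weights = rng.dirichlet(np.ones(k), size=private_img.shape)  # (*img_shape, k)
    weights = np.moveaxis(weights, -1, 0)                        # (k, *img_shape)
    # Random per-pixel signs give the "negative as well as positive" coefficients.
    signs = rng.choice([-1.0, 1.0], size=private_img.shape)
    mixed = sum(w * img for w, img in zip(weights, imgs))
    return signs * mixed
```

The mixed image would then be fed into the training pipeline in place of the original private image.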

Scaling Laws for Differentially Private Language Models

no code implementations · 31 Jan 2025 · Ryan McKenna, Yangsibo Huang, Amer Sinha, Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, Chiyuan Zhang

Scaling laws have emerged as important components of large language model (LLM) training, as they can predict performance gains from scale and provide guidance on important hyper-parameter choices that would otherwise be expensive to explore.

Language Modeling · Language Modelling +1
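
As a generic example of how a scaling law is used in practice (fit a saturating power law to observed losses and extrapolate to larger scale), here is a small sketch. The functional form, data points, and use of SciPy are illustrative assumptions, not the paper's DP-specific scaling laws.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    # Loss as a function of scale n (e.g., model or dataset size).
    return a * n ** (-alpha) + c

# Hypothetical (model_size, loss) observations.
sizes = np.array([1e7, 3e7, 1e8, 3e8, 1e9])
losses = np.array([4.1, 3.7, 3.3, 3.0, 2.8])

params, _ = curve_fit(power_law, sizes, losses, p0=[10.0, 0.1, 2.0], maxfev=10000)
print("predicted loss at 1e10 params:", power_law(1e10, *params))
```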

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

no code implementations · 13 Jan 2025 · Yangsibo Huang, Milad Nasr, Anastasios Angelopoulos, Nicholas Carlini, Wei-Lin Chiang, Christopher A. Choquette-Choo, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Ken Ziyu Liu, Ion Stoica, Florian Tramer, Chiyuan Zhang

Our attack consists of two steps: first, an attacker determines which model was used to generate a given reply with more than 95% accuracy; then, the attacker uses this information to consistently vote for (or against) a target model.

Chatbot
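
The two-step structure of the attack above (attribute a reply to a model, then vote accordingly) can be sketched as follows. The string-similarity detector is only a crude stand-in for the paper's much more accurate attribution method, and all names are hypothetical.

```python
from difflib import SequenceMatcher

def guess_source(reply, reference_replies):
    """Attribute `reply` to the candidate model whose reference output is most
    similar -- a rough placeholder for a >95%-accurate detector."""
    return max(reference_replies,
               key=lambda m: SequenceMatcher(None, reply, reference_replies[m]).ratio())

def cast_vote(reply_a, reply_b, reference_replies, target="target-model"):
    source_a = guess_source(reply_a, reference_replies)
    source_b = guess_source(reply_b, reference_replies)
    if source_a == target:
        return "vote A"      # consistently up-vote the target model
    if source_b == target:
        return "vote B"
    return "abstain"         # neither reply came from the model of interest
```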

On Evaluating the Durability of Safeguards for Open-Weight LLMs

1 code implementation · 10 Dec 2024 · Xiangyu Qi, Boyi Wei, Nicholas Carlini, Yangsibo Huang, Tinghao Xie, Luxi He, Matthew Jagielski, Milad Nasr, Prateek Mittal, Peter Henderson

Through several case studies, we demonstrate that even evaluating these defenses is exceedingly difficult and can easily mislead audiences into thinking that safeguards are more durable than they really are.

On Memorization of Large Language Models in Logical Reasoning

no code implementations · 30 Oct 2024 · Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, Ravi Kumar

In this paper, we systematically investigate this hypothesis with a quantitative measurement of memorization in reasoning tasks, using a dynamically generated logical reasoning benchmark based on Knights and Knaves (K&K) puzzles.

Logical Reasoning · Memorization
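
A dynamically generated Knights-and-Knaves benchmark can be illustrated with a toy generator and brute-force solver like the one below. This is a simplified sketch (only single "X is a knight/knave" statements), not the paper's benchmark code.

```python
import itertools
import random

def generate_kk_puzzle(n=3, seed=0):
    """Minimal Knights-and-Knaves generator: knights always tell the truth,
    knaves always lie. Each person states whether another person is a knight."""
    rng = random.Random(seed)
    truth = [rng.choice([True, False]) for _ in range(n)]   # True = knight
    statements = []
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])
        claim = truth[j] if truth[i] else not truth[j]      # knaves invert the truth
        statements.append((i, j, claim))                    # i says "j is a knight" iff claim
    return truth, statements

def solve(statements, n=3):
    """Brute-force all knight/knave assignments consistent with the statements."""
    solutions = []
    for assign in itertools.product([True, False], repeat=n):
        if all((assign[j] == claim) == assign[i] for i, j, claim in statements):
            solutions.append(assign)
    return solutions
```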

An Adversarial Perspective on Machine Unlearning for AI Safety

1 code implementation · 26 Sep 2024 · Jakub Łucki, Boyi Wei, Yangsibo Huang, Peter Henderson, Florian Tramèr, Javier Rando

Large language models are finetuned to refuse questions about hazardous knowledge, but these protections can often be bypassed.

Machine Unlearning

ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty

no code implementations · 26 Aug 2024 · Xindi Wu, Dingli Yu, Yangsibo Huang, Olga Russakovsky, Sanjeev Arora

Compositionality is a critical capability in Text-to-Image (T2I) models, as it reflects their ability to understand and combine multiple concepts from text descriptions.

Diversity · Image Generation

Evaluating Copyright Takedown Methods for Language Models

no code implementations · 26 Jun 2024 · Boyi Wei, Weijia Shi, Yangsibo Huang, Noah A. Smith, Chiyuan Zhang, Luke Zettlemoyer, Kai Li, Peter Henderson

Language models (LMs) derive their capabilities from extensive training on diverse data, including potentially copyrighted material.

Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning

no code implementations · 20 Jun 2024 · Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Daogao Liu, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

Large language models (LLMs) have emerged as powerful tools for tackling complex tasks across diverse domains, but they also raise privacy concerns when fine-tuned on sensitive data due to potential memorization.

Language Modeling · Language Modelling +2

Fantastic Copyrighted Beasts and How (Not) to Generate Them

1 code implementation · 20 Jun 2024 · Luxi He, Yangsibo Huang, Weijia Shi, Tinghao Xie, Haotian Liu, Yue Wang, Luke Zettlemoyer, Chiyuan Zhang, Danqi Chen, Peter Henderson

We show that state-of-the-art image and video generation models can still generate well-known characters even when their names are not explicitly mentioned, sometimes with only two generic keywords (e.g., prompting with "videogame, plumber" consistently generates Nintendo's Mario character).

Image Generation · Video Generation

AI Risk Management Should Incorporate Both Safety and Security

no code implementations · 29 May 2024 · Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security.

Management

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

no code implementations · 7 Feb 2024 · Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

We develop methods to identify critical regions that are vital for safety guardrails, and that are disentangled from utility-relevant regions at both the neuron and rank levels.

Safety Alignment
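
One way to make the neuron-level idea above concrete is to compute a first-order importance score separately on safety and utility data and keep the weights that matter for safety but not for utility. The gradient-times-weight score and the `loss_fn(model, batch)` interface below are assumptions for illustration, not the paper's exact procedure.

```python
import torch

def importance_scores(model, loss_fn, batch):
    """First-order importance per parameter: |weight * gradient| on one batch."""
    model.zero_grad()
    loss_fn(model, batch).backward()
    return {name: (p * p.grad).abs().detach()
            for name, p in model.named_parameters() if p.grad is not None}

def safety_only_mask(model, safety_loss, utility_loss,
                     safety_batch, utility_batch, top_frac=0.01):
    """Flag weights in the top `top_frac` for safety importance but not for
    utility importance -- a rough proxy for a 'disentangled' safety region."""
    s = importance_scores(model, safety_loss, safety_batch)
    u = importance_scores(model, utility_loss, utility_batch)
    masks = {}
    for name in s.keys() & u.keys():
        k = max(1, int(top_frac * s[name].numel()))
        s_top = s[name].flatten().topk(k).indices
        u_top = u[name].flatten().topk(k).indices
        masks[name] = torch.zeros(s[name].numel(), dtype=torch.bool)
        masks[name][s_top] = True
        masks[name][u_top] = False   # drop weights that also matter for utility
    return masks
```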

Detecting Pretraining Data from Large Language Models

1 code implementation · 25 Oct 2023 · Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer

Min-K% Prob can be applied without any knowledge about the pretraining corpus or any additional training, departing from previous detection methods that require training a reference model on data that is similar to the pretraining data.

Machine Unlearning
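
A minimal sketch of a Min-K% Prob style score using Hugging Face Transformers: average the log-probabilities of the k% least likely tokens and treat a higher score as evidence that the text was seen during pretraining. The choice of `gpt2` and the 20% default are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_prob(text, model, tokenizer, k=0.2):
    """Average log-probability of the k% least likely tokens in `text`."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_log_probs = log_probs.gather(
        1, enc["input_ids"][0, 1:].unsqueeze(-1)).squeeze(-1)
    n = max(1, int(k * token_log_probs.numel()))
    lowest = token_log_probs.topk(n, largest=False).values
    return lowest.mean().item()

# Hypothetical usage with a small open model:
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# score = min_k_prob("Some candidate passage ...", model, tokenizer)
```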

Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation

2 code implementations · 10 Oct 2023 · Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen

Finally, we propose an effective alignment method that explores diverse generation strategies, which can reasonably reduce the misalignment rate under our attack.

Red Teaming
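
The generation-exploitation idea, sweeping decoding configurations rather than crafting adversarial prompts, can be sketched as below. The model name, prompt, and hyper-parameter grid are placeholders, and the misalignment-scoring step is omitted.

```python
from itertools import product
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the paper evaluates several open-source chat models.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "..."  # evaluation prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Sweep decoding hyper-parameters instead of modifying the prompt itself.
for temperature, top_p, top_k in product([0.7, 1.0, 1.5], [0.7, 0.95, 1.0], [20, 50, 0]):
    output = model.generate(
        **inputs, do_sample=True, max_new_tokens=64,
        temperature=temperature, top_p=top_p, top_k=top_k,
    )
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    # Each decoded output would then be scored for misalignment; here it is just collected.
```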

Learning across Data Owners with Joint Differential Privacy

no code implementations · 25 May 2023 · Yangsibo Huang, Haotian Jiang, Daogao Liu, Mohammad Mahdian, Jieming Mao, Vahab Mirrokni

In this paper, we study the setting in which data owners train machine learning models collaboratively under a privacy notion called joint differential privacy [Kearns et al., 2018].

Multi-class Classification

Privacy Implications of Retrieval-Based Language Models

1 code implementation · 24 May 2023 · Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen

Crucially, we find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.

Retrieval

GMValuator: Similarity-based Data Valuation for Generative Models

no code implementations · 21 Apr 2023 · Jiaxi Yang, Wenglong Deng, Benlin Liu, Yangsibo Huang, James Zou, Xiaoxiao Li

Specifically, we introduce Generative Model Valuator (GMValuator), the first training-free and model-agnostic approach to provide data valuation for generation tasks.

Data Valuation · Image Quality Assessment
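
One plausible reading of similarity-based, training-free valuation is to credit each training point by its similarity to the generated samples it best matches. The sketch below uses cosine similarity over precomputed embeddings and is a simplification for illustration, not GMValuator's actual scoring.

```python
import numpy as np

def similarity_based_value(train_embs, gen_embs, top_k=5):
    """Credit each training sample by how often (and how closely) it appears
    among the nearest neighbours of generated samples in an embedding space."""
    t = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    g = gen_embs / np.linalg.norm(gen_embs, axis=1, keepdims=True)
    sims = g @ t.T                                    # (n_generated, n_train)
    values = np.zeros(len(train_embs))
    for row in sims:
        nearest = np.argsort(row)[-top_k:]            # top-k most similar training points
        values[nearest] += row[nearest]               # accumulate similarity as credit
    return values / len(gen_embs)
```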

Recovering Private Text in Federated Learning of Language Models

1 code implementation · 17 May 2022 · Samyak Gupta, Yangsibo Huang, Zexuan Zhong, Tianyu Gao, Kai Li, Danqi Chen

For the first time, we show the feasibility of recovering text from large batch sizes of up to 128 sentences.

Federated Learning · Word Embeddings

Evaluating Gradient Inversion Attacks and Defenses in Federated Learning

1 code implementation · NeurIPS 2021 · Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora

Gradient inversion attacks (or input recovery from gradients) are an emerging threat to the security and privacy of federated learning, whereby malicious eavesdroppers or participants in the protocol can partially recover clients' private data.

Federated Learning
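
The core of a gradient inversion (DLG-style) attack is to optimize a dummy input so that its gradient matches the gradient shared by a client. A minimal PyTorch sketch, with hypothetical `model`, `loss_fn`, `observed_grads`, and `label`, is:

```python
import torch

def gradient_inversion(model, loss_fn, observed_grads, input_shape, label,
                       steps=300, lr=0.1):
    """Recover an input by optimizing dummy data so that its gradient matches
    the gradient observed from a federated-learning client."""
    dummy = torch.randn(input_shape, requires_grad=True)
    opt = torch.optim.Adam([dummy], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = model(dummy)
        grads = torch.autograd.grad(loss_fn(pred, label), model.parameters(),
                                    create_graph=True)
        # L2 distance between the dummy gradient and the observed gradient.
        match = sum(((g - og) ** 2).sum() for g, og in zip(grads, observed_grads))
        match.backward()
        opt.step()
    return dummy.detach()
```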

EMA: Auditing Data Removal from Trained Models

1 code implementation · 8 Sep 2021 · Yangsibo Huang, Xiaoxiao Li, Kai Li

In this paper, we propose a new method called Ensembled Membership Auditing (EMA) for auditing data removal to overcome these limitations.
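
Very roughly, an ensembled membership audit calibrates several membership-inference metrics, votes per sample, and then tests whether the query set still looks trained-on. The sketch below is a loose approximation under those assumptions, not EMA's exact aggregation.

```python
import numpy as np
from scipy import stats

def ensemble_membership_audit(query_metrics, calib_metrics, alpha=0.05):
    """`*_metrics`: arrays of shape (n_samples, n_metrics), e.g. confidence,
    negative loss, negative entropy (higher = more member-like).
    Calibrate per-metric thresholds on non-member data, vote across metrics,
    then test whether more samples look like members than expected by chance."""
    thresholds = np.quantile(calib_metrics, 0.95, axis=0)    # per-metric cut-offs
    votes = (query_metrics > thresholds).mean(axis=1)        # fraction voting "member"
    inferred_member = votes >= 0.5
    p_value = stats.binomtest(int(inferred_member.sum()), len(inferred_member),
                              0.05, alternative="greater").pvalue
    return inferred_member.mean(), p_value
```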

MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery

no code implementations · 22 Oct 2020 · Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Kai Li

To address the issue that deep neural networks (DNNs) are vulnerable to model inversion attacks, we design an objective function that adjusts the separability of the hidden data representations, as a way to control the trade-off between data utility and vulnerability to inversion attacks.
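
A toy version of an objective that trades utility against the separability of hidden representations could look like the sketch below; it is an illustration of the general idea, and the paper's actual objective and analysis differ in the details.

```python
import torch
import torch.nn.functional as F

def separability_penalty(hidden):
    """Push hidden representations in a batch closer together (higher pairwise
    cosine similarity), which makes recovering individual inputs from them harder."""
    h = F.normalize(hidden, dim=1)
    sims = h @ h.T                                            # (batch, batch) similarities
    mask = ~torch.eye(len(h), dtype=torch.bool, device=h.device)
    return -sims[mask].mean()                                 # minimizing raises similarity

def mixcon_style_loss(logits, labels, hidden, lam=0.1):
    # Trade off task utility (cross-entropy) against vulnerability to inversion.
    return F.cross_entropy(logits, labels) + lam * separability_penalty(hidden)
```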

InstaHide: Instance-hiding Schemes for Private Distributed Learning

3 code implementations · 6 Oct 2020 · Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora

This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines.

Deep Learning Based Detection and Localization of Intracranial Aneurysms in Computed Tomography Angiography

no code implementations · 22 May 2020 · Dufan Wu, Daniel Montes, Ziheng Duan, Yangsibo Huang, Javier M. Romero, Ramon Gilberto Gonzalez, Quanzheng Li

Purpose: To develop CADIA, a supervised deep learning model based on a region proposal network coupled with a false-positive reduction module for the detection and localization of intracranial aneurysms (IA) from computed tomography angiography (CTA), and to compare our model's performance against a similar detection network.

Region Proposal · Specificity

Privacy-preserving Learning via Deep Net Pruning

no code implementations · 4 Mar 2020 · Yangsibo Huang, Yushan Su, Sachin Ravi, Zhao Song, Sanjeev Arora, Kai Li

This paper attempts to answer the question of whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility.

Network Pruning · Privacy Preserving

Deep Q Learning Driven CT Pancreas Segmentation with Geometry-Aware U-Net

no code implementations · 19 Apr 2019 · Yunze Man, Yangsibo Huang, Junyi Feng, Xi Li, Fei Wu

Segmentation of pancreas is important for medical image analysis, yet it faces great challenges of class imbalance, background distractions and non-rigid geometrical features.

Medical Image Analysis · Pancreas Segmentation +2
