no code implementations • ICML 2020 • Yangsibo Huang, Zhao Song, Sanjeev Arora, Kai Li
The new ideas in the current paper are: (a) new variants of mixup with negative as well as positive coefficients, and extend the sample-wise mixup to be pixel-wise.
no code implementations • 7 Mar 2024 • Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson
Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems.
no code implementations • 7 Feb 2024 • Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson
We develop methods to identify critical regions that are vital for safety guardrails, and that are disentangled from utility-relevant regions at both the neuron and rank levels.
no code implementations • 25 Oct 2023 • Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer
Min-K% Prob can be applied without any knowledge about the pretraining corpus or any additional training, departing from previous detection methods that require training a reference model on data that is similar to the pretraining data.
2 code implementations • 10 Oct 2023 • Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen
Finally, we propose an effective alignment method that explores diverse generation strategies, which can reasonably reduce the misalignment rate under our attack.
no code implementations • 25 May 2023 • Yangsibo Huang, Haotian Jiang, Daogao Liu, Mohammad Mahdian, Jieming Mao, Vahab Mirrokni
In this paper, we study the setting in which data owners train machine learning models collaboratively under a privacy notion called joint differential privacy [Kearns et al., 2018].
1 code implementation • 24 May 2023 • Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen
Crucially, we find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.
no code implementations • 21 Apr 2023 • Jiaxi Yang, Wenglong Deng, Benlin Liu, Yangsibo Huang, Xiaoxiao Li
To the best of their knowledge, GMValuator is the first work that offers a training-free, post-hoc data valuation strategy for deep generative models.
no code implementations • 21 Feb 2023 • Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee
Fine-tuning a language model on a new domain is standard practice for domain adaptation.
1 code implementation • 17 May 2022 • Samyak Gupta, Yangsibo Huang, Zexuan Zhong, Tianyu Gao, Kai Li, Danqi Chen
For the first time, we show the feasibility of recovering text from large batch sizes of up to 128 sentences.
1 code implementation • NeurIPS 2021 • Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora
Gradient inversion attack (or input recovery from gradient) is an emerging threat to the security and privacy preservation of Federated learning, whereby malicious eavesdroppers or participants in the protocol can recover (partially) the clients' private data.
1 code implementation • 8 Sep 2021 • Yangsibo Huang, Xiaoxiao Li, Kai Li
In this paper, we propose a new method called Ensembled Membership Auditing (EMA) for auditing data removal to overcome these limitations.
no code implementations • 23 Dec 2020 • Wei Qiu, Yangsibo Huang, Quanzheng Li
Missing value imputation is a challenging and well-researched topic in data mining.
no code implementations • 22 Oct 2020 • Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Kai Li
To address the issue that deep neural networks (DNNs) are vulnerable to model inversion attacks, we design an objective function, which adjusts the separability of the hidden data representations, as a way to control the trade-off between data utility and vulnerability to inversion attacks.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Yangsibo Huang, Zhao Song, Danqi Chen, Kai Li, Sanjeev Arora
In addition, TextHide fits well with the popular framework of fine-tuning pre-trained language models (e. g., BERT) for any sentence or sentence-pair task.
3 code implementations • 6 Oct 2020 • Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora
This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines.
no code implementations • 22 May 2020 • Dufan Wu, Daniel Montes, Ziheng Duan, Yangsibo Huang, Javier M. Romero, Ramon Gilberto Gonzalez, Quanzheng Li
Purpose: To develop CADIA, a supervised deep learning model based on a region proposal network coupled with a false-positive reduction module for the detection and localization of intracranial aneurysms (IA) from computed tomography angiography (CTA), and to assess our model's performance to a similar detection network.
no code implementations • 4 Mar 2020 • Yangsibo Huang, Yushan Su, Sachin Ravi, Zhao Song, Sanjeev Arora, Kai Li
This paper attempts to answer the question whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility.
no code implementations • 19 Apr 2019 • Yunze Man, Yangsibo Huang, Junyi Feng, Xi Li, Fei Wu
Segmentation of pancreas is important for medical image analysis, yet it faces great challenges of class imbalance, background distractions and non-rigid geometrical features.