Search Results for author: Huseyin A. Inan

Found 11 papers, 2 papers with code

When Does Differentially Private Learning Not Suffer in High Dimensions?

1 code implementation • 1 Jul 2022 • Xuechen Li, Daogao Liu, Tatsunori Hashimoto, Huseyin A. Inan, Janardhan Kulkarni, Yin Tat Lee, Abhradeep Guha Thakurta

Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models.

Privacy Leakage in Text Classification: A Data Extraction Approach

no code implementations • 9 Jun 2022 • Adel Elmahdy, Huseyin A. Inan, Robert Sim

Recent work has demonstrated the successful extraction of training data from generative language models.

Tasks: Classification, Memorization, +2 more

Differentially Private Fine-tuning of Language Models

2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.
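Private fine-tuning of this kind is typically built on the DP-SGD mechanism: clip each per-example gradient, average, and add calibrated Gaussian noise. A minimal sketch of one such update step, assuming NumPy arrays for gradients (function and parameter names are illustrative, not the paper's code):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_mult, rng):
    """One illustrative DP-SGD update: clip each per-example gradient
    to L2 norm `clip_norm`, average, then add Gaussian noise with
    standard deviation noise_mult * clip_norm / batch_size."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down gradients whose norm exceeds the clipping threshold.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm / len(clipped),
                       size=mean.shape)
    return mean + noise
```

The privacy budget $\epsilon$ reported above is then obtained by accounting over all such noisy steps; the noise multiplier and clipping norm are the knobs that trade utility against privacy.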

Tasks: Text Generation

Membership Inference on Word Embedding and Beyond

no code implementations • 21 Jun 2021 • Saeed Mahloujifar, Huseyin A. Inan, Melissa Chase, Esha Ghosh, Marcello Hasegawa

Our attack is a cheaper membership inference attack on text-generative models: it requires neither knowledge of the target model nor the expensive training of text-generative models as shadow models.

Tasks: Inference Attack, Language Modelling, +3 more

Privacy Regularization: Joint Privacy-Utility Optimization in Language Models

no code implementations • 12 Mar 2021 • FatemehSadat Mireshghallah, Huseyin A. Inan, Marcello Hasegawa, Victor Rühle, Taylor Berg-Kirkpatrick, Robert Sim

In this work, we introduce two privacy-preserving regularization methods for training language models that enable joint optimization of utility and privacy through (1) the use of a discriminator and (2) the inclusion of a triplet-loss term.
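The triplet-loss term mentioned in (2) is a standard construction: it pulls an anchor representation toward a "positive" example and pushes it away from a "negative" one. A minimal sketch of the generic form (the paper's exact formulation and choice of anchors/positives/negatives may differ):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss on embedding vectors:
    max(0, d(anchor, positive) - d(anchor, negative) + margin),
    using squared Euclidean distance d."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

The loss is zero once the negative is at least `margin` farther from the anchor than the positive, so the regularizer only acts on triplets that violate that separation.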

Tasks: Memorization, Privacy Preserving

Training Data Leakage Analysis in Language Models

no code implementations • 14 Jan 2021 • Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim

It has been demonstrated that the strong performance of language models comes with the ability to memorize rare training samples, which poses serious privacy threats when the model is trained on confidential user content.

rTop-k: A Statistical Estimation Approach to Distributed SGD

no code implementations • 21 May 2020 • Leighton Pate Barnes, Huseyin A. Inan, Berivan Isik, Ayfer Ozgur

The statistically optimal communication scheme arising from the analysis of this model leads to a new sparsification technique for SGD, which concatenates random-k and top-k, considered separately in the prior literature.
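A minimal sketch of one plausible reading of that combined scheme, assuming the two stages compose as "restrict to the r largest-magnitude coordinates, then transmit a uniformly random k of them" (the function name, the stage ordering, and the parameters here are my illustration, not the paper's code):

```python
import numpy as np

def rtop_k(grad, r, k, rng):
    """Sparsify a gradient vector by combining top-r and random-k:
    keep only k coordinates, chosen uniformly at random from the r
    entries with the largest magnitude. Assumes k <= r <= grad.size."""
    # top-r stage: indices of the r largest-magnitude entries
    top_r = np.argsort(np.abs(grad))[-r:]
    # random-k stage: uniform k-subset of those r indices
    keep = rng.choice(top_r, size=k, replace=False)
    out = np.zeros_like(grad)
    out[keep] = grad[keep]
    return out
```

Each worker would then communicate only the k surviving (index, value) pairs per step, which is the bandwidth saving that sparsified distributed SGD targets.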
