Search Results for author: Huseyin A. Inan

Found 14 papers, 5 papers with code

Differentially Private Training of Mixture of Experts Models

no code implementations • 11 Feb 2024 • Pierre Tholoniat, Huseyin A. Inan, Janardhan Kulkarni, Robert Sim

This position paper investigates the integration of Differential Privacy (DP) in the training of Mixture of Experts (MoE) models within the field of natural language processing.

Computational Efficiency Privacy Preserving

Privately Aligning Language Models with Reinforcement Learning

no code implementations • 25 Oct 2023 • Fan Wu, Huseyin A. Inan, Arturs Backurs, Varun Chandrasekaran, Janardhan Kulkarni, Robert Sim

Positioned between pre-training and user deployment, aligning large language models (LLMs) through reinforcement learning (RL) has emerged as a prevailing strategy for training instruction-following models such as ChatGPT.

Instruction Following Privacy Preserving +3

Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe

1 code implementation • 25 Oct 2022 • Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, Robert Sim

Privacy concerns have attracted increasing attention in data-driven products due to the tendency of machine learning models to memorize sensitive training data.

Language Modelling Text Generation
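
Per the title, the recipe is a two-stage pipeline: fine-tune a generative language model on the private corpus under differential privacy, then sample from it, relying on DP's post-processing property so the synthetic text inherits the privacy guarantee. Below is a minimal sketch of the pipeline using the Hugging Face transformers API; `dp_finetune`, the prompt, and all decoding parameters are illustrative placeholders rather than the authors' settings.

```python
# Sketch of the DP synthetic-text recipe. `dp_finetune` is a placeholder
# for a DP-SGD training loop (see the Opacus sketch under the ICLR 2022
# entry below); prompt and decoding parameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Stage 1: fine-tune on the private corpus under an (epsilon, delta)-DP
# guarantee (placeholder call, not a real API):
# model = dp_finetune(model, private_corpus, target_epsilon=4.0)

# Stage 2: sample freely; post-processing preserves the DP guarantee.
inputs = tokenizer("Customer feedback:", return_tensors="pt")
samples = model.generate(
    **inputs,
    do_sample=True,            # stochastic decoding for diverse samples
    top_p=0.95,                # nucleus sampling
    max_new_tokens=64,
    num_return_sequences=4,
    pad_token_id=tokenizer.eos_token_id,
)
for s in samples:
    print(tokenizer.decode(s, skip_special_tokens=True))
```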

Privacy Leakage in Text Classification: A Data Extraction Approach

no code implementations • 9 Jun 2022 • Adel Elmahdy, Huseyin A. Inan, Robert Sim

Recent work has demonstrated the successful extraction of training data from generative language models.

Memorization text-classification +1

Differentially Private Fine-tuning of Language Models

2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.

Text Generation
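
The quoted accuracies are obtained by fine-tuning with differentially private SGD under the stated budget. As a generic illustration of DP-SGD training to a fixed epsilon (not the authors' implementation, which fine-tunes RoBERTa), here is a sketch using the Opacus library with a toy classifier; all hyperparameters, including epsilon = 6.7, are chosen only to echo the numbers above.

```python
# Generic DP-SGD sketch with Opacus -- not the paper's code. The toy
# model, data, and hyperparameters (including epsilon = 6.7) are
# illustrative.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 3))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = TensorDataset(torch.randn(512, 128), torch.randint(0, 3, (512,)))
loader = DataLoader(data, batch_size=64)

engine = PrivacyEngine()
model, optimizer, loader = engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    target_epsilon=6.7,   # echoes the budget quoted above
    target_delta=1e-5,
    epochs=3,
    max_grad_norm=1.0,    # per-sample gradient clipping bound
)

loss_fn = nn.CrossEntropyLoss()
for _ in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()  # per-sample grads via hooks
        optimizer.step()                 # clip, add noise, update
print(f"epsilon spent: {engine.get_epsilon(delta=1e-5):.2f}")
```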

Membership Inference on Word Embedding and Beyond

no code implementations • 21 Jun 2021 • Saeed Mahloujifar, Huseyin A. Inan, Melissa Chase, Esha Ghosh, Marcello Hasegawa

Indeed, our attack is a cheaper membership inference attack on text-generative models, which requires neither knowledge of the target model nor the expensive training of text-generative models as shadow models.

Inference Attack Language Modelling +3
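
For context on the attack family referenced above: the cheapest membership inference tests simply threshold a per-sample statistic such as the model's loss (the Yeom et al., 2018 baseline). The sketch below shows that baseline only to make the threat model concrete; it is not the embedding-based attack proposed in this paper.

```python
# Loss-threshold membership inference baseline (Yeom et al., 2018),
# shown to illustrate the attack family; NOT this paper's attack.
import torch
from torch import nn

def loss_threshold_attack(model: nn.Module, loss_fn, x, y,
                          threshold: float) -> bool:
    """Predict 'member' when the per-sample loss falls below a threshold,
    exploiting the tendency of models to fit training points more tightly."""
    model.eval()
    with torch.no_grad():
        loss = loss_fn(model(x), y)
    return loss.item() < threshold

# Usage (hypothetical classifier `clf` and labeled sample (x, y)):
# is_member = loss_threshold_attack(clf, nn.CrossEntropyLoss(), x, y, 0.5)
```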

Privacy Regularization: Joint Privacy-Utility Optimization in Language Models

no code implementations • 12 Mar 2021 • FatemehSadat Mireshghallah, Huseyin A. Inan, Marcello Hasegawa, Victor Rühle, Taylor Berg-Kirkpatrick, Robert Sim

In this work, we introduce two privacy-preserving regularization methods for training language models that enable joint optimization of utility and privacy through (1) the use of a discriminator and (2) the inclusion of a triplet-loss term.

Memorization Privacy Preserving
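
As a shape-level sketch of method (2), the snippet below adds a triplet-loss term to the usual language-modeling objective. How the triplets are built (the `anchor`, `positive`, and `negative` encodings) and the weight `lam` are assumptions made for illustration here, not the paper's specification.

```python
# Sketch of a triplet-loss privacy regularizer added to the LM loss.
# Triplet construction and the weight `lam` are illustrative assumptions.
import torch
from torch import nn

lm_loss_fn = nn.CrossEntropyLoss()
triplet_fn = nn.TripletMarginLoss(margin=1.0)

def joint_loss(logits, targets, anchor, positive, negative, lam=0.1):
    """Utility (next-token prediction) loss plus a triplet privacy term."""
    utility = lm_loss_fn(logits.view(-1, logits.size(-1)), targets.view(-1))
    privacy = triplet_fn(anchor, positive, negative)  # hypothetical encodings
    return utility + lam * privacy
```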

Training Data Leakage Analysis in Language Models

no code implementations • 14 Jan 2021 • Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim

It has been demonstrated that the strong performance of language models comes with the ability to memorize rare training samples, which poses serious privacy threats when a model is trained on confidential user content.

Sentence

rTop-k: A Statistical Estimation Approach to Distributed SGD

no code implementations • 21 May 2020 • Leighton Pate Barnes, Huseyin A. Inan, Berivan Isik, Ayfer Ozgur

The statistically optimal communication scheme arising from the analysis of this model leads to a new sparsification technique for SGD, which concatenates random-k and top-k, two schemes previously considered separately in the literature.
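
Reading "concatenates random-k and top-k" literally: restrict to the r largest-magnitude gradient coordinates, then transmit a uniformly random k of those. A minimal sketch under that reading, with illustrative values of r and k:

```python
# rTop-k sparsification sketch: random-k applied on top of top-r.
# r and k are illustrative; the paper's analysis guides their choice.
import torch

def rtop_k(grad: torch.Tensor, r: int, k: int) -> torch.Tensor:
    """Keep k coordinates of `grad`, chosen uniformly at random
    from the r largest in magnitude; zero out the rest."""
    flat = grad.flatten()
    top_r = torch.topk(flat.abs(), r).indices      # top-k stage
    keep = top_r[torch.randperm(r)[:k]]            # random-k stage
    sparse = torch.zeros_like(flat)
    sparse[keep] = flat[keep]
    return sparse.view_as(grad)

g = torch.randn(1024)
print(rtop_k(g, r=64, k=16).count_nonzero())  # tensor(16)
```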
