no code implementations • NAACL (PrivateNLP) 2022 • Adel Elmahdy, Huseyin A. Inan, Robert Sim
Recent work has demonstrated the successful extraction of training data from generative language models.
no code implementations • 11 Feb 2024 • Pierre Tholoniat, Huseyin A. Inan, Janardhan Kulkarni, Robert Sim
This position paper investigates the integration of Differential Privacy (DP) in the training of Mixture of Experts (MoE) models within the field of natural language processing.
no code implementations • 25 Oct 2023 • Fan Wu, Huseyin A. Inan, Arturs Backurs, Varun Chandrasekaran, Janardhan Kulkarni, Robert Sim
Positioned between pre-training and user deployment, aligning large language models (LLMs) through reinforcement learning (RL) has emerged as a prevailing strategy for training instruction following-models such as ChatGPT.
1 code implementation • 21 Sep 2023 • Xinyu Tang, Richard Shin, Huseyin A. Inan, Andre Manoel, FatemehSadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Robert Sim
Our results demonstrate that our algorithm can achieve competitive performance with strong privacy levels.
1 code implementation • 16 Dec 2022 • C. M. Downey, Wei Dai, Huseyin A. Inan, Kim Laine, Saurabh Naik, Tomasz Religa
Language models are widely deployed to provide automatic text completion services in user products.
1 code implementation • 25 Oct 2022 • Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, Robert Sim
Privacy concerns have attracted increasing attention in data-driven products due to the tendency of machine learning models to memorize sensitive training data.
1 code implementation • 1 Jul 2022 • Xuechen Li, Daogao Liu, Tatsunori Hashimoto, Huseyin A. Inan, Janardhan Kulkarni, Yin Tat Lee, Abhradeep Guha Thakurta
Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models.
no code implementations • 9 Jun 2022 • Adel Elmahdy, Huseyin A. Inan, Robert Sim
Recent work has demonstrated the successful extraction of training data from generative language models.
2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang
For example, on the MNLI dataset we achieve an accuracy of $87. 8\%$ using RoBERTa-Large and $83. 5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6. 7$.
no code implementations • 21 Jun 2021 • Saeed Mahloujifar, Huseyin A. Inan, Melissa Chase, Esha Ghosh, Marcello Hasegawa
Indeed, our attack is a cheaper membership inference attack on text-generative models, which does not require the knowledge of the target model or any expensive training of text-generative models as shadow models.
no code implementations • 12 Mar 2021 • FatemehSadat Mireshghallah, Huseyin A. Inan, Marcello Hasegawa, Victor Rühle, Taylor Berg-Kirkpatrick, Robert Sim
In this work, we introduce two privacy-preserving regularization methods for training language models that enable joint optimization of utility and privacy through (1) the use of a discriminator and (2) the inclusion of a triplet-loss term.
no code implementations • 14 Jan 2021 • Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim
It has been demonstrated that strong performance of language models comes along with the ability to memorize rare training samples, which poses serious privacy threats in case the model is trained on confidential user content.
no code implementations • 21 May 2020 • Leighton Pate Barnes, Huseyin A. Inan, Berivan Isik, Ayfer Ozgur
The statistically optimal communication scheme arising from the analysis of this model leads to a new sparsification technique for SGD, which concatenates random-k and top-k, considered separately in the prior literature.
no code implementations • ICLR 2020 • Huseyin A. Inan, Gaurav Singh Tomar, Huapu Pan
In this work, we propose a generator-reranker architecture for semantic parsing.