1 code implementation • 10 Jun 2022 • Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople, Ahmed Salem, Victor Rühle, Andrew Paverd, Mohammad Naseri, Boris Köpf, Daniel Jones
Our Bayesian method exploits the hypothesis testing interpretation of differential privacy to obtain a posterior for $\varepsilon$ (not just a confidence interval) from the joint posterior of the false positive and false negative rates of membership inference attacks.
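As a rough illustration of the idea, and not the authors' implementation, the sketch below assumes independent uniform Beta(1, 1) priors on the false positive and false negative rates, so that their joint posterior factorises into two Beta distributions. Each posterior draw is mapped to the smallest $\varepsilon$ compatible with the $(\varepsilon, \delta)$-DP hypothesis testing constraints $\mathrm{FPR} + e^{\varepsilon}\,\mathrm{FNR} \ge 1 - \delta$ and $\mathrm{FNR} + e^{\varepsilon}\,\mathrm{FPR} \ge 1 - \delta$. The function names, the choice of prior, and the attack counts are hypothetical.

```python
import math
import random


def eps_lower_bound(fpr, fnr, delta=0.0):
    """Smallest epsilon consistent with (fpr, fnr) under (epsilon, delta)-DP.

    Uses the constraints fpr + e^eps * fnr >= 1 - delta and
    fnr + e^eps * fpr >= 1 - delta; epsilon is never below 0.
    """
    best = 0.0
    for a, b in ((fpr, fnr), (fnr, fpr)):
        num = 1.0 - delta - a
        if num <= 0.0:
            continue  # constraint holds for every epsilon >= 0
        if b == 0.0:
            return math.inf  # no finite epsilon satisfies the constraint
        best = max(best, math.log(num / b))
    return best


def sample_eps_posterior(fp, tn, fn, tp, delta=0.0, n_samples=10000, seed=0):
    """Draw posterior samples of epsilon from attack outcome counts.

    Assumption (not from the source): independent Beta(1, 1) priors on
    FPR and FNR, giving Beta(fp + 1, tn + 1) and Beta(fn + 1, tp + 1)
    posteriors. Returns the samples sorted in ascending order.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        fpr = rng.betavariate(fp + 1, tn + 1)
        fnr = rng.betavariate(fn + 1, tp + 1)
        samples.append(eps_lower_bound(fpr, fnr, delta))
    return sorted(samples)


# Hypothetical attack outcome: 1000 non-members and 1000 members.
samples = sample_eps_posterior(fp=50, tn=950, fn=400, tp=600, delta=1e-5)
median_eps = samples[len(samples) // 2]
credible_low = samples[int(0.025 * len(samples))]
credible_high = samples[int(0.975 * len(samples)) - 1]
```

Because the result is a full set of posterior samples, any summary (median, credible interval, tail probability) can be read directly from the sorted list rather than from a single confidence bound.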
no code implementations • 14 Jan 2021 • Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim
It has been demonstrated that the strong performance of language models comes with the ability to memorize rare training samples, which poses serious privacy threats when the model is trained on confidential user content.
1 code implementation • 8 Jun 2023 • Ana-Maria Cretu, Daniel Jones, Yves-Alexandre de Montjoye, Shruti Tople
Here we present the first systematic analysis of the causes of misalignment in shadow models and show that the use of a different weight initialisation is the main cause.