no code implementations • 13 Mar 2025 • Jing Xu, Franziska Boenisch, Iyiola Emmanuel Olatunji, Adam Dziedzic
Here, a GNN is pre-trained on public data and then adapted to sensitive tasks using lightweight graph prompts.
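A minimal sketch of the feature-level graph-prompting idea described above (all module names, the toy GCN layer, and hyperparameters are illustrative assumptions, not the paper's implementation): the pre-trained GNN stays frozen and only a small prompt vector added to every node's features, plus a task head, is trained on the sensitive task.

```python
# Illustrative sketch: adapt a frozen "pre-trained" GNN with a lightweight graph prompt.
import torch
import torch.nn as nn

class TinyGCNLayer(nn.Module):
    """One dense GCN-style layer: H' = relu(A_norm @ H @ W); stand-in for a pre-trained GNN."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out, bias=False)

    def forward(self, x, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return torch.relu((adj / deg) @ self.lin(x))

d, n_nodes, n_classes = 16, 10, 3
encoder = TinyGCNLayer(d, d)                 # pretend this was pre-trained on public graphs
for p in encoder.parameters():
    p.requires_grad_(False)                  # frozen backbone

prompt = nn.Parameter(torch.zeros(1, d))     # lightweight graph prompt: the only adapted weights
head = nn.Linear(d, n_classes)               # besides a small task head
opt = torch.optim.Adam([prompt, *head.parameters()], lr=1e-2)

x = torch.randn(n_nodes, d)                  # toy node features
adj = (torch.rand(n_nodes, n_nodes) > 0.7).float()
y = torch.randint(0, n_classes, (n_nodes,))

for _ in range(100):                         # prompt tuning on the sensitive task
    logits = head(encoder(x + prompt, adj))
    loss = nn.functional.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```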
no code implementations • 27 Feb 2025 • Jing Xu, Franziska Boenisch, Adam Dziedzic
Graph Neural Networks (GNNs) achieve high performance in various real-world applications, such as drug discovery, traffic state prediction, and recommendation systems.
no code implementations • 25 Feb 2025 • Shahrzad Kiani, Nupur Kulkarni, Adam Dziedzic, Stark Draper, Franziska Boenisch
Our framework enables each client to save privacy budget in early rounds so that more can be spent in later rounds, when additional accuracy helps in learning more fine-grained features.
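A small illustrative sketch of the budget-saving idea (the geometric schedule and the loose Gaussian-mechanism noise formula below are assumptions for illustration, not the paper's actual allocation or accounting): the total privacy budget is split non-uniformly across rounds, with early rounds receiving a smaller share and thus more noise.

```python
# Illustrative per-round privacy budget schedule: spend little early, more later.
import numpy as np

def increasing_budget_schedule(total_eps, n_rounds, growth=1.5):
    """Geometrically increasing per-round epsilons that sum to total_eps."""
    weights = growth ** np.arange(n_rounds)
    return total_eps * weights / weights.sum()

def noise_multiplier(eps_round, delta=1e-5):
    """Classic (loose) Gaussian-mechanism noise scale for a single round."""
    return np.sqrt(2.0 * np.log(1.25 / delta)) / eps_round

per_round_eps = increasing_budget_schedule(total_eps=8.0, n_rounds=10)
for t, eps in enumerate(per_round_eps):
    print(f"round {t}: eps={eps:.3f}, noise multiplier={noise_multiplier(eps):.2f}")
```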
no code implementations • 11 Feb 2025 • Wenhao Wang, Adam Dziedzic, Grace C. Kim, Michael Backes, Franziska Boenisch
Multi-modal models, such as CLIP, have demonstrated strong performance in aligning visual and textual representations, excelling in tasks like image retrieval and zero-shot classification.
1 code implementation • 7 Feb 2025 • Aditya Kumar, Tom Blanchard, Adam Dziedzic, Franziska Boenisch
Our benchmark aims to guide future efforts in mitigating NSFW text generation in text-to-image models and is available at https://github.com/sprintml/ToxicBench
1 code implementation • 4 Feb 2025 • Antoni Kowalczuk, Jan Dubiński, Franziska Boenisch, Adam Dziedzic
Using this MIA, we perform dataset inference (DI) and find that IARs require as few as six samples to detect dataset membership, compared to 200 for DMs, indicating higher information leakage.
1 code implementation • 19 Nov 2024 • Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch, Adam Dziedzic
CDI relies on dataset inference techniques, i.e., instead of using the membership signal from a single data point, CDI leverages the fact that most data owners, such as providers of stock photography, visual media companies, or even individual artists, own datasets with multiple publicly exposed data points which might all be included in the training of a given DM.
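A generic dataset-inference sketch of the aggregation idea (the loss-based membership score and the t-test are illustrative stand-ins, not CDI's actual features or statistical procedure): instead of deciding membership for a single image, a per-sample membership score is aggregated over many of the owner's samples and tested against scores on held-out controls.

```python
# Illustrative dataset inference: aggregate per-sample membership signals, then test.
import numpy as np
from scipy import stats

def membership_score(losses):
    """Lower model loss -> higher membership score (toy proxy signal)."""
    return -np.asarray(losses)

rng = np.random.default_rng(0)
candidate_losses = rng.normal(loc=0.9, scale=0.3, size=50)   # owner's (suspected training) samples
control_losses   = rng.normal(loc=1.1, scale=0.3, size=50)   # samples the model surely never saw

# One-sided test: are candidate scores significantly larger than control scores?
t, p = stats.ttest_ind(membership_score(candidate_losses),
                       membership_score(control_losses),
                       equal_var=False, alternative="greater")
print(f"p-value = {p:.4f} -> dataset was {'likely' if p < 0.01 else 'not provably'} used in training")
```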
no code implementations • 15 Nov 2024 • Haonan Duan, Adam Dziedzic, Mohammad Yaghini, Nicolas Papernot, Franziska Boenisch
We show that deploying prompted models presents a significant privacy risk for the data used within the prompt by instantiating a highly effective membership inference attack.
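A toy threshold-based membership-inference sketch (not the paper's actual attack; the quantile threshold and loss signal are assumptions): a data point used inside the prompt tends to receive unusually low loss, so a candidate is flagged as a member when its loss falls in the low tail of losses observed on known non-members.

```python
# Illustrative loss-threshold membership inference against a prompted model.
import numpy as np

def loss_threshold_mia(candidate_loss, reference_losses, quantile=0.05):
    """Return True ('member') if the candidate's loss is in the low tail of
    losses observed on points known to be outside the prompt."""
    threshold = np.quantile(reference_losses, quantile)
    return candidate_loss < threshold

rng = np.random.default_rng(1)
reference_losses = rng.normal(2.0, 0.5, size=1000)   # non-member query losses
print(loss_threshold_mia(candidate_loss=0.4, reference_losses=reference_losses))  # -> True
```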
no code implementations • 2 Nov 2024 • Vincent Hanke, Tom Blanchard, Franziska Boenisch, Iyiola Emmanuel Olatunji, Michael Backes, Adam Dziedzic
By examining their threat models and thoroughly comparing their performance under different privacy levels according to differential privacy (DP), various LLM architectures, and multiple datasets for classification and generation tasks, we find that: (1) all the methods leak query data, i.e., the (potentially sensitive) user data that is queried at inference time, to the LLM provider, (2) three out of four methods also leak large fractions of private training data to the LLM provider while the method that protects private data requires a local open LLM, (3) all the methods exhibit lower performance compared to three private gradient-based adaptation methods for local open LLMs, and (4) the private adaptation methods for closed LLMs incur higher monetary training and query costs than running the alternative methods on local open LLMs.
no code implementations • 27 Sep 2024 • Wenhao Wang, Adam Dziedzic, Michael Backes, Franziska Boenisch
Recent work studying memorization in self-supervised learning (SSL) suggests that even though SSL encoders are trained on millions of images, they still memorize individual data points.
1 code implementation • 17 Jul 2024 • Antoni Kowalczuk, Jan Dubiński, Atiyeh Ashari Ghomi, Yi Sui, George Stein, Jiapeng Wu, Jesse C. Cresswell, Franziska Boenisch, Adam Dziedzic
Large-scale vision models have become integral in many applications due to their unprecedented performance and versatility across downstream tasks.
no code implementations • 12 Jun 2024 • Dariush Wahdany, Matthew Jagielski, Adam Dziedzic, Franziska Boenisch
We additionally show that privacy-utility trade-offs can be further improved when leveraging the public data beyond pre-training of the encoder: in particular, we can privately sample our DP prototypes from the publicly available data points used to train the encoder.
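A minimal sketch of differentially private class prototypes (the clipping, noise scale, and nearest-prototype classifier below are illustrative assumptions, not the paper's exact construction): embeddings from a frozen public encoder are clipped in L2 norm, averaged per class, and perturbed with Gaussian noise before being used for prediction.

```python
# Illustrative DP class prototypes from a frozen encoder's embeddings.
import numpy as np

def dp_prototype(embeddings, clip=1.0, sigma=0.5, rng=np.random.default_rng(0)):
    e = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(e, axis=1, keepdims=True)
    e = e * np.minimum(1.0, clip / np.maximum(norms, 1e-12))        # per-example clipping
    mean = e.mean(axis=0)
    noise = rng.normal(0.0, sigma * clip / len(e), size=mean.shape)  # sensitivity clip / n
    return mean + noise

protos = {c: dp_prototype(np.random.randn(100, 64) + c) for c in range(3)}
query = np.random.randn(64)
pred = min(protos, key=lambda c: np.linalg.norm(query - protos[c]))  # nearest-prototype rule
print("predicted class:", pred)
```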
1 code implementation • 10 Jun 2024 • Pratyush Maini, Hengrui Jia, Nicolas Papernot, Adam Dziedzic
Instead, we propose a new dataset inference method to accurately identify the datasets used to train large language models.
no code implementations • 5 Jun 2024 • Yihan Wang, Yiwei Lu, Guojun Zhang, Franziska Boenisch, Adam Dziedzic, Yaoliang Yu, Xiao-Shan Gao
Machine unlearning provides viable solutions to revoke the effect of certain training data on pre-trained model parameters.
1 code implementation • 4 Jun 2024 • Dominik Hintersdorf, Lukas Struppek, Kristian Kersting, Adam Dziedzic, Franziska Boenisch
Unfortunately, this practice raises privacy and intellectual property concerns, as DMs can memorize and later reproduce their potentially sensitive or copyrighted training images at inference time.
1 code implementation • 20 May 2024 • Marcin Podhajski, Jan Dubiński, Franziska Boenisch, Adam Dziedzic, Agnieszka Pregowska, Tomasz P. Michalak
Graph Neural Networks (GNNs) are recognized as potent tools for processing real-world data organized in graph structures.
1 code implementation • 31 Jan 2024 • Congyu Fang, Adam Dziedzic, Lin Zhang, Laura Oliva, Amol Verma, Fahad Razak, Nicolas Papernot, Bo Wang
In addition, the ML models trained with the DeCaPH framework generally outperform those trained solely on the private datasets of individual parties, showing that DeCaPH enhances model generalizability.
1 code implementation • 19 Jan 2024 • Wenhao Wang, Muhammad Ahmad Kaleem, Adam Dziedzic, Michael Backes, Nicolas Papernot, Franziska Boenisch
Our definition compares the alignment between the representations of data points and their augmented views, as returned by encoders that were trained on these data points and by encoders that were not.
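A toy numeric illustration of this alignment-difference idea (the cosine-similarity alignment, the random-projection "encoders", and all names are assumptions for illustration, not the paper's exact metric): memorization of a point is scored by how much better an encoder aligns the point with its augmented view when the point was in the training set than when it was not.

```python
# Illustrative alignment-difference memorization score for an SSL encoder.
import numpy as np

def alignment(rep_x, rep_aug):
    """Cosine similarity between the representation of x and of its augmentation."""
    return float(rep_x @ rep_aug / (np.linalg.norm(rep_x) * np.linalg.norm(rep_aug)))

def memorization_score(enc_with, enc_without, x, x_aug):
    """Higher score = x and its view align much better only when x was trained on."""
    return alignment(enc_with(x), enc_with(x_aug)) - alignment(enc_without(x), enc_without(x_aug))

# Stand-in "encoders": random projections seeded differently to mimic two training runs.
rng = np.random.default_rng(0)
W_with, W_without = rng.normal(size=(32, 16)), rng.normal(size=(32, 16))
x = rng.normal(size=16); x_aug = x + 0.05 * rng.normal(size=16)
print(memorization_score(lambda v: W_with @ v, lambda v: W_without @ v, x, x_aug))
```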
no code implementations • NeurIPS 2023 • Franziska Boenisch, Christopher Mühl, Adam Dziedzic, Roy Rinberg, Nicolas Papernot
DP-SGD is the canonical approach to training models with differential privacy.
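For context, a minimal single-step DP-SGD sketch in plain numpy (hyperparameters are illustrative; a real implementation would use a DP library and a privacy accountant): each example's gradient is clipped to a fixed L2 norm, the clipped gradients are averaged, and Gaussian noise calibrated to the clip norm is added before the update.

```python
# Illustrative DP-SGD step: per-example clipping + Gaussian noise on the averaged gradient.
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip=1.0, noise_multiplier=1.1,
                rng=np.random.default_rng(0)):
    clipped = []
    for g in per_example_grads:                       # per-example clipping
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip / len(clipped), size=mean_grad.shape)
    return params - lr * (mean_grad + noise)          # noisy gradient update

params = np.zeros(5)
grads = [np.random.randn(5) for _ in range(32)]       # stand-in per-example gradients
print(dp_sgd_step(params, grads))
```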
no code implementations • 9 Jan 2023 • Franziska Boenisch, Adam Dziedzic, Roei Schuster, Ali Shahin Shamsabadi, Ilia Shumailov, Nicolas Papernot
FL is promoted as a privacy-enhancing technology (PET) that provides data minimization: data never "leaves" personal devices and users share only model updates with a server (e.g., a company) coordinating the distributed training.
no code implementations • 23 Nov 2022 • Adam Dziedzic, Christopher A Choquette-Choo, Natalie Dullerud, Vinith Menon Suriyakumar, Ali Shahin Shamsabadi, Muhammad Ahmad Kaleem, Somesh Jha, Nicolas Papernot, Xiao Wang
We use our mechanisms to enable privacy-preserving multi-label learning in the central setting by extending the canonical single-label technique: PATE.
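A simplified sketch of the per-label noisy vote aggregation pattern this builds on (the Laplace noise, majority threshold, and shapes are illustrative, not the paper's actual multi-label mechanism): each teacher votes 0/1 for every label, noise is added to the positive-vote counts, and a label is output as 1 when the noisy count exceeds half the teachers.

```python
# Illustrative PATE-style noisy aggregation extended to multi-label votes.
import numpy as np

def noisy_multilabel_aggregate(teacher_votes, scale=2.0, rng=np.random.default_rng(0)):
    votes = np.asarray(teacher_votes)                 # shape: (n_teachers, n_labels), entries in {0, 1}
    positive_counts = votes.sum(axis=0).astype(float)
    noisy_counts = positive_counts + rng.laplace(0.0, scale, size=positive_counts.shape)
    return (noisy_counts > votes.shape[0] / 2).astype(int)

teacher_votes = np.random.default_rng(1).integers(0, 2, size=(100, 5))  # 100 teachers, 5 labels
print(noisy_multilabel_aggregate(teacher_votes))
```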
no code implementations • 16 Sep 2022 • Adam Dziedzic, Haonan Duan, Muhammad Ahmad Kaleem, Nikita Dhawan, Jonas Guan, Yannis Cattan, Franziska Boenisch, Nicolas Papernot
We introduce a new dataset inference defense, which uses the private training set of the victim encoder model to attribute its ownership in the event of stealing.
no code implementations • 25 Jul 2022 • Adam Dziedzic, Stephan Rabanser, Mohammad Yaghini, Armin Ale, Murat A. Erdogdu, Nicolas Papernot
We introduce $p$-DkNN, a novel inference procedure that takes a trained deep neural network and analyzes the similarity structures of its intermediate hidden representations to compute $p$-values associated with the end-to-end model prediction.
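A simplified conformal-style sketch of the p-value computation at a single hidden layer (the actual $p$-DkNN procedure combines evidence across layers; the kNN nonconformity score and calibration setup below are assumptions): a test point's nonconformity is the number of nearest calibration neighbours that disagree with the predicted label, and its p-value is the fraction of calibration points at least as nonconforming.

```python
# Illustrative kNN nonconformity + conformal p-value at one representation layer.
import numpy as np

def knn_nonconformity(test_rep, cal_reps, cal_labels, predicted_label, k=10):
    dists = np.linalg.norm(cal_reps - test_rep, axis=1)
    neighbours = cal_labels[np.argsort(dists)[:k]]
    return np.sum(neighbours != predicted_label)       # higher = less conforming

def p_value(test_score, calibration_scores):
    """Fraction of calibration points at least as nonconforming as the test point."""
    return (np.sum(calibration_scores >= test_score) + 1) / (len(calibration_scores) + 1)

rng = np.random.default_rng(0)
cal_reps, cal_labels = rng.normal(size=(200, 32)), rng.integers(0, 3, 200)
cal_scores = np.array([knn_nonconformity(r, cal_reps, cal_labels, l)
                       for r, l in zip(cal_reps, cal_labels)])
test_rep, pred = rng.normal(size=32), 1
print(p_value(knn_nonconformity(test_rep, cal_reps, cal_labels, pred), cal_scores))
```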
no code implementations • 26 May 2022 • Stephan Rabanser, Anvith Thudi, Kimia Hamidieh, Adam Dziedzic, Nicolas Papernot
Selective classification is the task of rejecting inputs that a model would predict incorrectly, trading off input-space coverage against model accuracy.
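A generic selective-classification baseline to make the coverage/accuracy trade-off concrete (this confidence-threshold rule is a standard baseline, not the paper's method): abstain whenever the top softmax probability falls below a threshold, and report coverage together with accuracy on the accepted inputs.

```python
# Illustrative confidence-threshold selective prediction.
import numpy as np

def selective_predict(probs, threshold=0.8):
    confidence = probs.max(axis=1)
    preds = probs.argmax(axis=1)
    accept = confidence >= threshold            # answer only when confident enough
    return preds, accept

rng = np.random.default_rng(0)
probs = rng.dirichlet(alpha=[1, 1, 1], size=1000)      # stand-in softmax outputs
labels = rng.integers(0, 3, size=1000)
preds, accept = selective_predict(probs, threshold=0.8)
coverage = accept.mean()
acc_on_accepted = (preds[accept] == labels[accept]).mean() if accept.any() else float("nan")
print(f"coverage={coverage:.2f}, accuracy on accepted={acc_on_accepted:.2f}")
```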
1 code implementation • 16 May 2022 • Adam Dziedzic, Nikita Dhawan, Muhammad Ahmad Kaleem, Jonas Guan, Nicolas Papernot
We construct several novel attacks and find that approaches that train directly on a victim's stolen representations are query efficient and enable high accuracy for downstream models.
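A sketch of that attack pattern (the stand-in encoder API, the linear probe, and all names are illustrative assumptions, not the paper's attack implementations): the adversary queries the victim encoder with its own inputs, collects the returned representations, and trains a cheap downstream classifier directly on them.

```python
# Illustrative encoder-stealing use: train a downstream head on representations
# returned by a victim encoder's query API.
import numpy as np
from sklearn.linear_model import LogisticRegression

def victim_encoder_api(images):
    """Stand-in for the victim's embedding endpoint (returns 128-d features)."""
    rng = np.random.default_rng(0)
    W = rng.normal(size=(images.shape[1], 128))
    return np.tanh(images @ W)

rng = np.random.default_rng(1)
attacker_images = rng.normal(size=(500, 3072))           # attacker's own data
attacker_labels = rng.integers(0, 10, size=500)           # labels for the downstream task

stolen_reps = victim_encoder_api(attacker_images)         # queries to the victim
downstream = LogisticRegression(max_iter=1000).fit(stolen_reps, attacker_labels)
print("downstream train accuracy:", downstream.score(stolen_reps, attacker_labels))
```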
no code implementations • 21 Feb 2022 • Franziska Boenisch, Christopher Mühl, Roy Rinberg, Jannis Ihrig, Adam Dziedzic
Applying machine learning (ML) to sensitive domains requires privacy protection of the underlying training data through formal privacy frameworks, such as differential privacy (DP).
no code implementations • ICLR 2022 • Adam Dziedzic, Muhammad Ahmad Kaleem, Yu Shen Lu, Nicolas Papernot
Since we calibrate the effort required to complete the proof-of-work to each query, this only introduces a slight overhead for regular users (up to 2x).
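A hashcash-style proof-of-work sketch of the calibration idea (the difficulty-setting rule and parameters are assumptions, not the paper's calibration): the server scales the required work with how suspicious a query looks, so regular users only ever solve easy puzzles while adversarial query patterns become expensive.

```python
# Illustrative per-query proof-of-work: harder puzzles for more suspicious queries.
import hashlib
from itertools import count

def difficulty_for(query_suspicion):          # e.g. in [0, 1]; higher = harder puzzle
    return 8 + int(12 * query_suspicion)      # number of leading zero bits required

def solve_pow(challenge: bytes, difficulty: int) -> int:
    target = 1 << (256 - difficulty)
    for nonce in count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

nonce = solve_pow(b"query-123", difficulty_for(0.1))   # benign query: cheap to answer
print("found nonce:", nonce)
```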
1 code implementation • 6 Dec 2021 • Franziska Boenisch, Adam Dziedzic, Roei Schuster, Ali Shahin Shamsabadi, Ilia Shumailov, Nicolas Papernot
Instead, these devices share gradients, parameters, or other model updates with a central party (e.g., a company) coordinating the training.
no code implementations • 3 Aug 2021 • Adelin Travers, Lorna Licollari, Guanghan Wang, Varun Chandrasekaran, Adam Dziedzic, David Lie, Nicolas Papernot
In the white-box setting, we instantiate this class with a joint, multi-stage optimization attack.
1 code implementation • ICLR 2021 • Christopher A. Choquette-Choo, Natalie Dullerud, Adam Dziedzic, Yunxiang Zhang, Somesh Jha, Nicolas Papernot, Xiao Wang
There is currently no method that enables machine learning in such a setting, where both confidentiality and privacy need to be preserved, to prevent both explicit and implicit sharing of data.
1 code implementation • ACL 2020 • Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, Dawn Song
Although pretrained Transformers such as BERT achieve high accuracy on in-distribution examples, do they generalize to new distributions?
no code implementations • 18 Mar 2020 • Adam Dziedzic, Vanlin Sathya, Muhammad Iqbal Rochman, Monisha Ghosh, Sanjay Krishnan
The promise of ML techniques in solving non-linear problems motivated this work, which aims to apply known ML techniques and develop new ones for wireless spectrum sharing between Wi-Fi and LTE in the unlicensed spectrum.
no code implementations • 8 Feb 2020 • Adam Dziedzic, Sanjay Krishnan
Recent work has extensively shown that randomized perturbations of neural networks can improve robustness to adversarial attacks.
no code implementations • 21 Nov 2019 • Vanlin Sathya, Adam Dziedzic, Monisha Ghosh, Sanjay Krishnan
This approach delivers accuracy close to 100%, compared to the auto-correlation (AC) and energy detection (ED) approaches.
no code implementations • 21 Nov 2019 • Adam Dziedzic, John Paparrizos, Sanjay Krishnan, Aaron Elmore, Michael Franklin
Convolutional layers are core building blocks of neural network architectures.
no code implementations • 25 Sep 2019 • Adam Dziedzic, Sanjay Krishnan
The existence of adversarial examples, or intentional mis-predictions constructed from small changes to correctly predicted examples, is one of the most significant challenges in neural network research today.
3 code implementations • 21 Aug 2019 • Max Kaufmann, Daniel Kang, Yi Sun, Steven Basart, Xuwang Yin, Mantas Mazeika, Akul Arora, Adam Dziedzic, Franziska Boenisch, Tom Brown, Jacob Steinhardt, Dan Hendrycks
To narrow in on this discrepancy between research and reality, we introduce ImageNet-UA, a framework for evaluating model robustness against a range of unforeseen adversaries, including eighteen new non-L_p attacks.