no code implementations • 15 Feb 2018 • Congzheng Song, Vitaly Shmatikov
We demonstrate that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images.
no code implementations • 31 Jan 2018 • Congzheng Song, Yiming Sun
Gaussian processes (GPs) are flexible models that can capture complex structure in large-scale datasets due to their non-parametric nature.
1 code implementation • 27 Sep 2016 • Safoora Yousefi, Congzheng Song, Nelson Nauata, Lee Cooper
Genomics is rapidly transforming medical practice and basic biomedical research, providing insights into disease mechanisms and improving therapeutic strategies, particularly in cancer.
no code implementations • 15 Mar 2018 • Tyler Hunt, Congzheng Song, Reza Shokri, Vitaly Shmatikov, Emmett Witchel
Existing ML-as-a-service platforms require users to reveal all training data to the service operator.
no code implementations • ICLR 2020 • Congzheng Song, Vitaly Shmatikov
For example, a binary gender classifier of facial images also learns to recognize races (even races that are not represented in the training data) and identities.
no code implementations • 28 Sep 2019 • Congzheng Song, Shanghang Zhang, Najmeh Sadoughi, Pengtao Xie, Eric Xing
The International Classification of Diseases (ICD) is a list of classification codes for diagnoses.
no code implementations • 27 Sep 2019 • Congzheng Song, Reza Shokri
In this paper, we present membership encoding for training deep neural networks while encoding the membership information, i.e., whether a data point is used for training, for a subset of the training data.
no code implementations • 31 Mar 2020 • Congzheng Song, Ananth Raghunathan
We demonstrate that embeddings, in addition to encoding generic semantics, often also present a vector that leaks sensitive information about the input data.
no code implementations • 5 Jul 2020 • Roei Schuster, Congzheng Song, Eran Tromer, Vitaly Shmatikov
We demonstrate that neural code autocompleters are vulnerable to poisoning attacks.
no code implementations • 15 Mar 2022 • Eugene Bagdasaryan, Congzheng Song, Rogier Van Dalen, Matt Seigel, Áine Cahill
During private federated learning of the language model, we sample from the model, train a new tokenizer on the sampled sequences, and update the model embeddings.
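The sample-retrain-update loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the unigram "model", the frequency-based tokenizer, and the helper names (`sample_sequences`, `retrain_tokenizer`, `update_embeddings`) are all assumptions standing in for a real language model and subword tokenizer.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def sample_sequences(vocab, probs, n=200, length=5):
    # Stand-in for sampling from the trained language model:
    # draw token sequences from a unigram distribution.
    return [list(rng.choice(vocab, size=length, p=probs)) for _ in range(n)]

def retrain_tokenizer(sequences, vocab_size=3):
    # Train a "new tokenizer" on the sampled sequences: here, simply
    # keep the most frequent sampled tokens as the new vocabulary.
    counts = Counter(tok for seq in sequences for tok in seq)
    return [tok for tok, _ in counts.most_common(vocab_size)]

def update_embeddings(old_vocab, old_emb, new_vocab, dim):
    # Update the model's embedding table for the new vocabulary:
    # reuse rows for tokens that survive, initialize the rest randomly.
    new_emb = rng.normal(0.0, 0.02, size=(len(new_vocab), dim))
    old_index = {tok: i for i, tok in enumerate(old_vocab)}
    for j, tok in enumerate(new_vocab):
        if tok in old_index:
            new_emb[j] = old_emb[old_index[tok]]
    return new_emb

old_vocab = ["the", "cat", "sat", "rare"]
probs = [0.5, 0.3, 0.19, 0.01]
old_emb = rng.normal(size=(4, 8))

samples = sample_sequences(old_vocab, probs)
new_vocab = retrain_tokenizer(samples)
new_emb = update_embeddings(old_vocab, old_emb, new_vocab, dim=8)
```

Because the new tokenizer is trained only on sequences sampled from the (privately trained) model rather than on raw user data, the procedure avoids touching user text directly.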
no code implementations • 18 Jul 2022 • MingBin Xu, Congzheng Song, Ye Tian, Neha Agrawal, Filip Granqvist, Rogier Van Dalen, Xiao Zhang, Arturo Argueta, Shiyi Han, Yaqiao Deng, Leo Liu, Anmol Walia, Alex Jin
Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP.
no code implementations • 14 Jul 2023 • Tatsuki Koga, Congzheng Song, Martin Pelikan, Mona Chitnis
Federated learning (FL) combined with differential privacy (DP) offers machine learning (ML) training with distributed devices and with a formal privacy guarantee.
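A common way FL and DP are combined is via the Gaussian mechanism on aggregated client updates: clip each update to bound sensitivity, sum, and add calibrated noise. The sketch below is a generic DP federated-averaging round, not this paper's specific protocol; the function name and parameters are illustrative.

```python
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0, seed=0):
    # One round of federated averaging with the Gaussian mechanism:
    # clip each client's update to L2 norm <= clip_norm (bounding each
    # client's contribution), sum, add Gaussian noise, then average.
    rng = np.random.default_rng(seed)
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(client_updates)

updates = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
noisy_avg = dp_fedavg_round(updates)
```

The clipping norm and noise multiplier jointly determine the formal (epsilon, delta) privacy guarantee, which is tracked across rounds with a privacy accountant.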
no code implementations • 27 Jul 2023 • Kunal Talwar, Shan Wang, Audra McMillan, Vojta Jina, Vitaly Feldman, Bailey Basile, Aine Cahill, Yi Sheng Chan, Mike Chatzidakis, Junye Chen, Oliver Chick, Mona Chitnis, Suman Ganta, Yusuf Goren, Filip Granqvist, Kristine Guo, Frederic Jacobs, Omid Javidbakht, Albert Liu, Richard Low, Dan Mascenik, Steve Myers, David Park, Wonhee Park, Gianni Parsa, Tommy Pauly, Christian Priebe, Rehan Rishi, Guy Rothblum, Michael Scaria, Linmao Song, Congzheng Song, Karl Tarbe, Sebastian Vogt, Luke Winstrom, Shundong Zhou
We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data.
no code implementations • 14 Feb 2024 • Tao Yu, Congzheng Song, Jianyu Wang, Mona Chitnis
Asynchronous protocols have been shown to improve the scalability of federated learning (FL) with a massive number of clients.
2 code implementations • 1 Nov 2018 • Congzheng Song, Vitaly Shmatikov
To help enforce data-protection regulations such as GDPR and detect unauthorized uses of personal data, we develop a new model auditing technique that helps users check if their data was used to train a machine learning model.
1 code implementation • EMNLP 2020 • Congzheng Song, Alexander M. Rush, Vitaly Shmatikov
We study semantic collisions: texts that are semantically unrelated but judged as similar by NLP models.
1 code implementation • 22 Sep 2017 • Congzheng Song, Thomas Ristenpart, Vitaly Shmatikov
In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, while the model remains as accurate and predictive as a conventionally trained one.
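One white-box variant of such memorization can be illustrated by encoding secret bits in the signs of model parameters. The sketch below is a toy illustration, not the paper's implementation; the helper names `encode_bits_in_signs` and `decode_bits_from_signs` are hypothetical.

```python
import numpy as np

def encode_bits_in_signs(params, bits):
    # Toy sign encoding: force the sign of each of the first len(bits)
    # parameters to carry one bit (positive = 1, negative = 0).
    # Magnitudes are preserved, so model accuracy is barely affected.
    out = params.copy()
    for i, b in enumerate(bits):
        mag = abs(out[i])
        out[i] = mag if b else -mag
    return out

def decode_bits_from_signs(params, n):
    # Anyone with white-box access can read the bits back from the signs.
    return [1 if p > 0 else 0 for p in params[:n]]

weights = np.array([0.5, -0.2, 0.3, -0.7])
secret = [1, 1, 0, 0]
encoded = encode_bits_in_signs(weights, secret)
recovered = decode_bits_from_signs(encoded, len(secret))
```

Because only signs change, the parameter magnitudes (and hence most of the model's behavior) are preserved, which is what makes this kind of covert channel hard to notice.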
1 code implementation • 10 May 2018 • Luca Melis, Congzheng Song, Emiliano De Cristofaro, Vitaly Shmatikov
First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference).
1 code implementation • 18 Jul 2022 • Congzheng Song, Filip Granqvist, Kunal Talwar
We believe FLAIR can serve as a challenging benchmark for advancing the state of the art in federated learning.
1 code implementation • 9 Apr 2024 • Filip Granqvist, Congzheng Song, Áine Cahill, Rogier Van Dalen, Martin Pelikan, Yi Sheng Chan, Xiaojun Feng, Natarajan Krishnaswami, Vojta Jina, Mona Chitnis
Federated learning (FL) is an emerging machine learning (ML) training paradigm where clients own their data and collaborate to train a global model, without revealing any data to the server or other participants.
11 code implementations • 18 Oct 2016 • Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov
We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained.
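The intuition the attack exploits is that models tend to be more confident on their training members than on unseen points. The paper's full attack trains shadow models and an attack classifier; the sketch below is only a simplified confidence-thresholding variant on toy data, with an assumed helper name `confidence_threshold_attack`.

```python
import numpy as np

def confidence_threshold_attack(confidences, threshold=0.9):
    # Simplified membership test: predict "member" when the target
    # model's confidence on its predicted class exceeds a threshold.
    return confidences > threshold

# Toy confidences: overfit models are typically more confident on
# training members than on non-members.
member_conf = np.array([0.99, 0.95, 0.97])
nonmember_conf = np.array([0.60, 0.70, 0.85])

member_guess = confidence_threshold_attack(member_conf)
nonmember_guess = confidence_threshold_attack(nonmember_conf)
```

The gap between the two confidence distributions is exactly what the shadow-model attack learns to exploit, and it grows with overfitting.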