Search Results for author: Congzheng Song

Found 17 papers, 6 papers with code

FLAIR: Federated Learning Annotated Image Repository

1 code implementation • 18 Jul 2022 • Congzheng Song, Filip Granqvist, Kunal Talwar

We believe FLAIR can serve as a challenging benchmark for advancing the state of the art in federated learning.

Federated Learning, Multi-Label Classification

Training a Tokenizer for Free with Private Federated Learning

no code implementations • 15 Mar 2022 • Eugene Bagdasaryan, Congzheng Song, Rogier Van Dalen, Matt Seigel, Áine Cahill

During private federated learning of the language model, we sample from the model, train a new tokenizer on the sampled sequences, and update the model embeddings.

Federated Learning, Language Modelling

Information Leakage in Embedding Models

no code implementations • 31 Mar 2020 • Congzheng Song, Ananth Raghunathan

We demonstrate that embeddings, in addition to encoding generic semantics, often also present a vector that leaks sensitive information about the input data.

Sentence Embeddings

Generalized Zero-shot ICD Coding

no code implementations • 28 Sep 2019 • Congzheng Song, Shanghang Zhang, Najmeh Sadoughi, Pengtao Xie, Eric Xing

The International Classification of Diseases (ICD) is a list of classification codes for diagnoses.

Classification, General Classification (+3 more)

Robust Membership Encoding: Inference Attacks and Copyright Protection for Deep Learning

no code implementations • 27 Sep 2019 • Congzheng Song, Reza Shokri

In this paper, we present membership encoding for training deep neural networks and encoding the membership information, i.e., whether a data point is used for training, for a subset of training data.

Model Compression

Overlearning Reveals Sensitive Attributes

no code implementations • ICLR 2020 • Congzheng Song, Vitaly Shmatikov

For example, a binary gender classifier of facial images also learns to recognize races (even races that are not represented in the training data) and identities.

Auditing Data Provenance in Text-Generation Models

2 code implementations • 1 Nov 2018 • Congzheng Song, Vitaly Shmatikov

To help enforce data-protection regulations such as GDPR and detect unauthorized uses of personal data, we develop a new model auditing technique that helps users check if their data was used to train a machine learning model.

Text Generation

Exploiting Unintended Feature Leakage in Collaborative Learning

1 code implementation • 10 May 2018 • Luca Melis, Congzheng Song, Emiliano De Cristofaro, Vitaly Shmatikov

First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference).

Federated Learning

Chiron: Privacy-preserving Machine Learning as a Service

no code implementations • 15 Mar 2018 • Tyler Hunt, Congzheng Song, Reza Shokri, Vitaly Shmatikov, Emmett Witchel

Existing ML-as-a-service platforms require users to reveal all training data to the service operator.

Cryptography and Security

Fooling OCR Systems with Adversarial Text Images

no code implementations • 15 Feb 2018 • Congzheng Song, Vitaly Shmatikov

We demonstrate that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images.

Adversarial Text, Optical Character Recognition

Kernel Distillation for Fast Gaussian Processes Prediction

no code implementations • 31 Jan 2018 • Congzheng Song, Yiming Sun

Gaussian processes (GPs) are flexible models that can capture complex structure in large-scale datasets due to their non-parametric nature.

Gaussian Processes

Machine Learning Models that Remember Too Much

no code implementations • 22 Sep 2017 • Congzheng Song, Thomas Ristenpart, Vitaly Shmatikov

In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, yet the model remains as accurate and predictive as a conventionally trained model.

BIG-bench Machine Learning, Data Augmentation (+2 more)

Membership Inference Attacks against Machine Learning Models

10 code implementations • 18 Oct 2016 • Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov

We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained.

BIG-bench Machine Learning, General Classification (+2 more)

Learning Genomic Representations to Predict Clinical Outcomes in Cancer

1 code implementation • 27 Sep 2016 • Safoora Yousefi, Congzheng Song, Nelson Nauata, Lee Cooper

Genomics is rapidly transforming medical practice and basic biomedical research, providing insights into disease mechanisms and improving therapeutic strategies, particularly in cancer.

Survival Analysis
