Search Results for author: Vitaly Shmatikov

Found 26 papers, 16 papers with code

How To Break Anonymity of the Netflix Prize Dataset

no code implementations • 18 Oct 2006 • Arvind Narayanan, Vitaly Shmatikov

We present a new class of statistical de-anonymization attacks against high-dimensional micro-data, such as individual preferences, recommendations, transaction records and so on.

Cryptography and Security • Databases
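
The attack idea generalizes beyond Netflix: given a little auxiliary information about a target (a few approximate ratings), an adversary scores every record in the dataset and picks the best match. The sketch below is a minimal illustration of that scoring step with made-up data, not the paper's exact Scoreboard-RH algorithm.

```python
# Minimal sketch of similarity-scoring de-anonymization (illustrative only;
# not the paper's exact algorithm). Each record maps movie -> rating.
records = {
    "user_1": {"Heat": 5, "Alien": 4, "Brazil": 3},
    "user_2": {"Heat": 2, "Fargo": 5, "Brazil": 3},
    "user_3": {"Alien": 4, "Fargo": 1, "Seven": 5},
}

def score(aux, record):
    """Count auxiliary (movie, rating) pairs that approximately match the record."""
    return sum(1 for movie, rating in aux.items()
               if movie in record and abs(record[movie] - rating) <= 1)

# A few approximate ratings the adversary learned out of band (e.g., public reviews).
aux = {"Heat": 5, "Brazil": 3}

best = max(records, key=lambda uid: score(aux, records[uid]))
print(best, score(aux, records[best]))   # user_1 matches both auxiliary ratings
```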

Can we still avoid automatic face detection?

9 code implementations • 14 Feb 2016 • Michael J. Wilber, Vitaly Shmatikov, Serge Belongie

In this setting, is it still possible for privacy-conscientious users to avoid automatic face detection and recognition?

Face Detection • Face Recognition

Defeating Image Obfuscation with Deep Learning

no code implementations • 1 Sep 2016 • Richard McPherson, Reza Shokri, Vitaly Shmatikov

We demonstrate that modern image recognition methods based on artificial neural networks can recover hidden information from images protected by various forms of obfuscation.

Privacy Preserving
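
The core observation is that obfuscation need not be inverted: a standard classifier trained directly on obfuscated images can still recognize their content. A minimal sketch, assuming a toy pixelation function and placeholder tensors in place of the paper's datasets and architectures:

```python
# Sketch: train a classifier directly on pixelated images (placeholder data,
# not the paper's datasets or models).
import torch
import torch.nn as nn
import torch.nn.functional as F

def pixelate(x, block=8):
    """Mosaic obfuscation: average over block x block cells, then upsample."""
    down = F.avg_pool2d(x, block)
    return F.interpolate(down, scale_factor=block, mode="nearest")

model = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(3):                        # toy loop; real training uses a labeled dataset
    x = torch.rand(32, 1, 32, 32)         # stand-in images
    y = torch.randint(0, 10, (32,))       # stand-in labels
    loss = F.cross_entropy(model(pixelate(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```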

Membership Inference Attacks against Machine Learning Models

12 code implementations • 18 Oct 2016 • Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov

We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained.

BIG-bench Machine Learning • General Classification • +2
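
The paper's attack trains "shadow models" on data the attacker controls, labels their outputs as member or non-member, and fits an attack classifier on those confidence vectors. A simplified sketch with one shadow model, synthetic Gaussian data, and scikit-learn models standing in for the real target:

```python
# Simplified membership-inference sketch (one shadow model, synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n):
    X = rng.normal(size=(n, 20))
    y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

# Shadow model: the attacker knows exactly which records were in its training set.
Xs_in, ys_in = make_data(100)
Xs_out, ys_out = make_data(100)
shadow = RandomForestClassifier(n_estimators=50, random_state=0).fit(Xs_in, ys_in)

def attack_features(model, X, y):
    p = model.predict_proba(X)                        # model's confidence vector per record
    return np.column_stack([p.max(axis=1), p[np.arange(len(y)), y]])

A_X = np.vstack([attack_features(shadow, Xs_in, ys_in),
                 attack_features(shadow, Xs_out, ys_out)])
A_y = np.concatenate([np.ones(100), np.zeros(100)])   # 1 = member, 0 = non-member
attack = LogisticRegression().fit(A_X, A_y)

# Target model trained the same way: the attack transfers because it overfits similarly.
Xt_in, yt_in = make_data(100)
Xt_out, yt_out = make_data(100)
target = RandomForestClassifier(n_estimators=50, random_state=1).fit(Xt_in, yt_in)
print("avg membership score, members    :",
      attack.predict_proba(attack_features(target, Xt_in, yt_in))[:, 1].mean())
print("avg membership score, non-members:",
      attack.predict_proba(attack_features(target, Xt_out, yt_out))[:, 1].mean())
```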

Machine Learning Models that Remember Too Much

1 code implementation • 22 Sep 2017 • Congzheng Song, Thomas Ristenpart, Vitaly Shmatikov

In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, yet leave the model as accurate and predictive as a conventionally trained one.

BIG-bench Machine Learning • Data Augmentation • +2
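
One of the paper's white-box attacks encodes training data directly in the low-order bits of model parameters, which barely changes their values. A minimal sketch of that LSB-encoding idea on a stand-in parameter vector (the paper also gives black-box variants that need no parameter access):

```python
# Sketch of LSB encoding: hide secret bits in the lowest mantissa bit of float32
# parameters with negligible effect on their values. Illustrative only.
import numpy as np

params = np.random.randn(16).astype(np.float32)               # stand-in model parameters
secret = np.random.randint(0, 2, size=16).astype(np.uint32)   # bits to hide

bits = params.view(np.uint32).copy()
bits = (bits & ~np.uint32(1)) | secret        # overwrite the lowest mantissa bit
stego = bits.view(np.float32)

recovered = stego.view(np.uint32) & np.uint32(1)   # extraction: read the bit back
print(np.array_equal(recovered, secret))           # True
print(np.max(np.abs(stego - params)))              # tiny perturbation
```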

Fooling OCR Systems with Adversarial Text Images

no code implementations • 15 Feb 2018 • Congzheng Song, Vitaly Shmatikov

We demonstrate that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images.

Adversarial Text • Optical Character Recognition • +1

Chiron: Privacy-preserving Machine Learning as a Service

no code implementations • 15 Mar 2018 • Tyler Hunt, Congzheng Song, Reza Shokri, Vitaly Shmatikov, Emmett Witchel

Existing ML-as-a-service platforms require users to reveal all training data to the service operator.

Cryptography and Security

Exploiting Unintended Feature Leakage in Collaborative Learning

1 code implementation • 10 May 2018 • Luca Melis, Congzheng Song, Emiliano De Cristofaro, Vitaly Shmatikov

First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference).

Federated Learning
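
One concrete channel: in collaborative training of a text model, the gradient of the embedding layer is nonzero only for the rows of words that appeared in a participant's batch, so shared updates reveal those words. A minimal sketch with a toy vocabulary and model (names and data are placeholders):

```python
# Sketch of embedding-gradient leakage in collaborative/federated training.
import torch
import torch.nn as nn

vocab = ["the", "cat", "sat", "on", "mat", "secret", "clinic"]
emb = nn.Embedding(len(vocab), 8)
head = nn.Linear(8, 2)

batch = torch.tensor([[5, 6]])                  # participant's private words: "secret clinic"
logits = head(emb(batch).mean(dim=1))
loss = nn.functional.cross_entropy(logits, torch.tensor([1]))
loss.backward()

# The server (or another participant) inspects the shared gradient:
leaked_rows = emb.weight.grad.abs().sum(dim=1).nonzero().flatten()
print([vocab[i] for i in leaked_rows])          # ['secret', 'clinic']
```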

How To Backdoor Federated Learning

3 code implementations • 2 Jul 2018 • Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, Vitaly Shmatikov

An attacker selected in a single round of federated learning can cause the global model to immediately reach 100% accuracy on the backdoor task.

Anomaly Detection • Data Poisoning • +2
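
The underlying "model replacement" trick: the attacker scales its backdoored update so that, after the server averages it with the benign updates, the global model lands approximately on the attacker's model. A toy sketch with stand-in parameter vectors and a FedAvg-style aggregation rule:

```python
# Sketch of model replacement in federated learning (toy parameter vectors).
import numpy as np

n, eta = 10, 1.0                      # participants per round, server learning rate
G = np.zeros(4)                       # current global model (stand-in)
X = np.array([1.0, -2.0, 0.5, 3.0])   # attacker's backdoored model
benign = [G + np.random.normal(scale=0.01, size=4) for _ in range(n - 1)]

gamma = n / eta
malicious = gamma * (X - G) + G       # scaled ("boosted") malicious update

updates = benign + [malicious]
new_G = G + (eta / n) * sum(u - G for u in updates)   # FedAvg-style aggregation
print(np.round(new_G, 2))             # close to X: the backdoored model replaces G
```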

Auditing Data Provenance in Text-Generation Models

2 code implementations • 1 Nov 2018 • Congzheng Song, Vitaly Shmatikov

To help enforce data-protection regulations such as GDPR and detect unauthorized uses of personal data, we develop a new "model auditing" technique that helps users check if their data was used to train a machine learning model.

Memorization • Text Generation

Overlearning Reveals Sensitive Attributes

no code implementations • ICLR 2020 • Congzheng Song, Vitaly Shmatikov

For example, a binary gender classifier of facial images also learns to recognize races (even races that are not represented in the training data) and identities.
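
The inference step can be as simple as fitting a probe on the model's internal representations for an attribute that was never a training label. A minimal sketch with a stand-in feature extractor and synthetic data (not the paper's models or datasets):

```python
# Sketch: probe a frozen model's internal representation for a sensitive attribute.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

torch.manual_seed(0)
feature_extractor = nn.Sequential(nn.Linear(32, 16), nn.ReLU())   # layers below the task head
task_head = nn.Linear(16, 2)                                       # e.g., a gender classifier

x = torch.randn(500, 32)                        # stand-in inputs
sensitive = (x[:, 0] > 0).long()                # sensitive attribute correlated with inputs

with torch.no_grad():
    reps = feature_extractor(x)                 # internal representations
probe = LogisticRegression().fit(reps.numpy(), sensitive.numpy())
print("probe accuracy:", probe.score(reps.numpy(), sensitive.numpy()))
```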

Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning

no code implementations • 14 Jan 2020 • Roei Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov

Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word "meaning" in the sense that distances between words' vectors correspond to their semantic proximity.

Data Poisoning • Information Retrieval • +7

Salvaging Federated Learning by Local Adaptation

2 code implementations • 12 Feb 2020 • Tao Yu, Eugene Bagdasaryan, Vitaly Shmatikov

First, we show that on standard tasks such as next-word prediction, many participants gain no benefit from FL because the federated model is less accurate on their data than the models they can train locally on their own.

Federated Learning • Knowledge Distillation • +1
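
The simplest adaptation baseline is for each participant to fine-tune the received global model on its own local data; the paper also evaluates freezing lower layers and knowledge distillation. A minimal sketch of that local fine-tuning step with a placeholder model and data:

```python
# Sketch: a participant adapts the federated model on its own local data.
import copy
import torch
import torch.nn as nn

global_model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

def adapt_locally(global_model, local_x, local_y, epochs=5, lr=1e-3):
    local = copy.deepcopy(global_model)            # keep the federated model intact
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    for _ in range(epochs):
        loss = nn.functional.cross_entropy(local(local_x), local_y)
        opt.zero_grad(); loss.backward(); opt.step()
    return local

# Placeholder local dataset for one participant.
x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
personal_model = adapt_locally(global_model, x, y)
```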

Blind Backdoors in Deep Learning Models

1 code implementation • 8 May 2020 • Eugene Bagdasaryan, Vitaly Shmatikov

We investigate a new method for injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code.
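
In this threat model the attacker never touches the training data or the trained model; it only controls the code that computes the loss, which can synthesize backdoored inputs on the fly and blend a backdoor objective into the returned loss value. A minimal sketch with a placeholder trigger and target label (the paper balances the objectives with multi-objective optimization rather than a fixed weight):

```python
# Sketch of a compromised loss computation that injects a backdoor objective.
import torch
import torch.nn.functional as F

BACKDOOR_LABEL = 7                        # attacker-chosen target class (placeholder)

def add_trigger(x):
    x = x.clone()
    x[:, :, :3, :3] = 1.0                 # small white patch as a stand-in trigger
    return x

def compromised_loss(model, x, y, alpha=0.5):
    task_loss = F.cross_entropy(model(x), y)
    bd_x = add_trigger(x)                 # backdoored inputs synthesized on the fly
    bd_y = torch.full_like(y, BACKDOOR_LABEL)
    backdoor_loss = F.cross_entropy(model(bd_x), bd_y)
    return (1 - alpha) * task_loss + alpha * backdoor_loss   # looks like one loss value

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
print(compromised_loss(model, x, y).item())
```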

Spinning Sequence-to-Sequence Models with Meta-Backdoors

no code implementations • 22 Jul 2021 • Eugene Bagdasaryan, Vitaly Shmatikov

We introduce the concept of a "meta-backdoor" to explain model-spinning attacks.

Sentiment Analysis

Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures

1 code implementation • 9 Dec 2021 • Eugene Bagdasaryan, Vitaly Shmatikov

Whereas conventional backdoors cause models to produce incorrect outputs on inputs with the trigger, outputs of spun models preserve context and maintain standard accuracy metrics, yet also satisfy a meta-task chosen by the adversary.

Text Generation

Data Isotopes for Data Provenance in DNNs

no code implementations • 29 Aug 2022 • Emily Wenger, Xiuyu Li, Ben Y. Zhao, Vitaly Shmatikov

With only query access to a trained model, no knowledge of the model training process, and no control over the data labels, a user can apply statistical hypothesis testing to detect whether a model has learned the spurious features associated with their isotopes by training on the user's data.

Memorization
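
The detection step compares the model's confidence on probe inputs that carry the isotope's spurious feature against matched probes without it, and applies a hypothesis test. A minimal sketch with synthetic confidence scores standing in for real model queries:

```python
# Sketch of the hypothesis test behind isotope detection (synthetic placeholder scores).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
conf_with_mark = rng.normal(0.71, 0.05, size=200)   # model queried on marked probes
conf_without = rng.normal(0.55, 0.05, size=200)     # model queried on clean probes

t, p = ttest_ind(conf_with_mark, conf_without, alternative="greater")
print(f"t = {t:.2f}, p = {p:.3g}")
if p < 0.01:
    print("Evidence that the model learned the isotope's spurious feature.")
```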

Mithridates: Auditing and Boosting Backdoor Resistance of Machine Learning Pipelines

1 code implementation • 9 Feb 2023 • Eugene Bagdasaryan, Vitaly Shmatikov

Given the variety of potential backdoor attacks, ML engineers who are not security experts have no way to measure how vulnerable their current training pipelines are, nor do they have a practical way to compare training configurations so as to pick the more resistant ones.

AutoML • Federated Learning

Abusing Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs

1 code implementation • 19 Jul 2023 • Eugene Bagdasaryan, Tsung-Yin Hsieh, Ben Nassi, Vitaly Shmatikov

We demonstrate how images and sounds can be used for indirect prompt and instruction injection in multi-modal LLMs.

Adversarial Illusions in Multi-Modal Embeddings

1 code implementation • 22 Aug 2023 • Tingwei Zhang, Rishi Jha, Eugene Bagdasaryan, Vitaly Shmatikov

In this paper, we show that multi-modal embeddings can be vulnerable to an attack we call "adversarial illusions."

Image Generation • Text Generation • +1
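
The attack perturbs an input in one modality so that its embedding aligns with the embedding of an attacker-chosen target in another modality. A minimal PGD-style sketch that maximizes cosine similarity against a stand-in encoder and target vector (the paper attacks real multi-modal embeddings; the encoder below is a placeholder):

```python
# Sketch: align an input's embedding with a target embedding via small perturbations.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
image_encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 64))
target_embedding = torch.randn(1, 64)            # e.g., embedding of the attacker's chosen text

image = torch.rand(1, 3, 32, 32)
delta = torch.zeros_like(image, requires_grad=True)
eps, step = 8 / 255, 1 / 255

for _ in range(100):                             # PGD-style ascent on cosine similarity
    sim = F.cosine_similarity(image_encoder(image + delta), target_embedding).mean()
    grad, = torch.autograd.grad(sim, delta)
    with torch.no_grad():
        delta += step * grad.sign()
        delta.clamp_(-eps, eps)                  # stay within the L-inf perturbation budget
        delta.copy_((image + delta).clamp(0, 1) - image)   # keep pixel values valid

print(F.cosine_similarity(image_encoder(image + delta), target_embedding).item())
```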

Text Embeddings Reveal (Almost) As Much As Text

1 code implementation • 10 Oct 2023 • John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Alexander M. Rush

How much private information do text embeddings reveal about the original text?
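
A toy way to see why embeddings leak: an adversary with the embedding function and the target vector can search for text whose embedding is closest to it. The sketch below uses a hashed bag-of-words embedder as a stand-in encoder; the paper's iterative inversion method is far more powerful than this naive candidate search.

```python
# Toy nearest-candidate "inversion" against a stand-in embedding function.
import numpy as np

def embed(text, dim=256):
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

target = embed("the patient was prescribed 50mg of sertraline")   # vector the adversary sees

candidates = [
    "the weather in ithaca is cold today",
    "the patient was prescribed 50mg of sertraline",
    "we present a new class of statistical attacks",
]
scores = [float(embed(c) @ target) for c in candidates]
print(candidates[int(np.argmax(scores))])      # recovers the original sentence
```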

Language Model Inversion

2 code implementations • 22 Nov 2023 • John X. Morris, Wenting Zhao, Justin T. Chiu, Vitaly Shmatikov, Alexander M. Rush

We consider the problem of language model inversion and show that next-token probabilities contain a surprising amount of information about the preceding text.

Language Modelling
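
The signal being inverted is the full next-token probability distribution that some language-model APIs expose. The sketch below only shows how to obtain that vector from GPT-2 with the Hugging Face transformers library; the paper's inversion model, which maps such vectors back to the hidden prompt, is not reproduced here.

```python
# Sketch: extract the next-token probability vector that the paper inverts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The patient's diagnosis is"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]          # logits for the next token
probs = torch.softmax(logits, dim=-1)               # ~50k-dimensional probability vector

top = torch.topk(probs, 5)
print([(tok.decode(int(i)), round(p.item(), 4)) for i, p in zip(top.indices, top.values)])
```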
