Search Results for author: Vitaly Shmatikov

Found 37 papers, 22 papers with code

Approximating Language Model Training Data from Weights

no code implementations • 18 Jun 2025 • John X. Morris, Junjie Oscar Yin, Woojeong Kim, Vitaly Shmatikov, Alexander M. Rush

When applied to a model trained with SFT on MSMARCO web documents, our method reduces perplexity from 3.3 to 2.3, compared to an expert LLAMA model's perplexity of 2.0.

Harnessing the Universal Geometry of Embeddings

2 code implementations • 18 May 2025 • Rishi Jha, Collin Zhang, Vitaly Shmatikov, John X. Morris

We introduce the first method for translating text embeddings from one vector space to another without any paired data, encoders, or predefined sets of matches.

Attribute

Universal Zero-shot Embedding Inversion

1 code implementation • 31 Mar 2025 • Collin Zhang, John X. Morris, Vitaly Shmatikov

From the NLP perspective, it helps determine how much semantic information about the input is retained in the embedding.

Multi-Agent Systems Execute Arbitrary Malicious Code

no code implementations • 15 Mar 2025 • Harold Triedman, Rishi Jha, Vitaly Shmatikov

Multi-agent systems coordinate LLM-based agents to perform tasks on users' behalf.

Rerouting LLM Routers

no code implementations • 3 Jan 2025 • Avital Shafran, Roei Schuster, Thomas Ristenpart, Vitaly Shmatikov

LLM routers aim to balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity.

Adversarial Robustness
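
As background for the entry above, here is a minimal sketch of the routing setup such attacks target, with a hypothetical `complexity_score` heuristic and placeholder model names; it is not the routers or the rerouting attack studied in the paper.

```python
# Minimal sketch of an LLM router (hypothetical heuristic and model names;
# not the routers or the rerouting attack studied in the paper).

def complexity_score(query: str) -> float:
    """Toy proxy for query complexity: longer queries with longer words score higher."""
    words = query.split()
    return len(words) + sum(len(w) > 8 for w in words)

def route(query: str, threshold: float = 12.0) -> str:
    """Send simple queries to a cheap model, complex ones to an expensive one."""
    return "cheap-llm" if complexity_score(query) < threshold else "expensive-llm"

print(route("What is the capital of France?"))                     # cheap-llm
print(route("Derive the closed-form solution of ridge regression "
            "and discuss its bias-variance tradeoff in detail."))  # expensive-llm
```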

Adversarial Hubness in Multi-Modal Retrieval

1 code implementation • 18 Dec 2024 • Tingwei Zhang, Fnu Suya, Rishi Jha, Collin Zhang, Vitaly Shmatikov

For example, in text-caption-to-image retrieval, a single adversarial hub, generated with respect to 100 randomly selected target queries, is retrieved as the top-1 most relevant image for more than 21,000 out of 25,000 test queries (by contrast, the most common natural hub is the top-1 response to only 102 queries), demonstrating the strong generalization capabilities of adversarial hubs.

Image Retrieval Information Retrieval +1
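
The hub counts quoted above boil down to a top-1 tally over an embedding index. The sketch below shows that tally on synthetic random embeddings; the attack itself optimizes the hub image, which is not shown.

```python
# Tallying how often each gallery item is the top-1 result across queries,
# i.e., the "hubness" statistic cited above. Synthetic embeddings only; the
# attack crafts the hub image, which is not shown here.
import numpy as np

rng = np.random.default_rng(0)
queries = rng.normal(size=(25_000, 128))   # stand-in text query embeddings
gallery = rng.normal(size=(1_000, 128))    # stand-in image embeddings

# Normalize rows so a dot product equals cosine similarity.
queries /= np.linalg.norm(queries, axis=1, keepdims=True)
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)

top1 = (queries @ gallery.T).argmax(axis=1)         # top-1 gallery index per query
counts = np.bincount(top1, minlength=len(gallery))  # how often each item is top-1
print("most hub-like item answers", counts.max(), "of", len(queries), "queries")
```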

Adversarial Decoding: Generating Readable Documents for Adversarial Objectives

1 code implementation • 3 Oct 2024 • Collin Zhang, Tingwei Zhang, Vitaly Shmatikov

We design, implement, and evaluate adversarial decoding, a new, generic text generation technique that produces readable documents for different adversarial objectives.

Adversarial Text RAG +2

Self-interpreting Adversarial Images

1 code implementation • 12 Jul 2024 • Tingwei Zhang, Collin Zhang, John X. Morris, Eugene Bagdasarian, Vitaly Shmatikov

We introduce a new type of indirect, cross-modal injection attacks against visual language models that enable creation of self-interpreting images.

Misinformation

Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents

no code implementations • 9 Jun 2024 • Avital Shafran, Roei Schuster, Vitaly Shmatikov

Retrieval-augmented generation (RAG) systems respond to queries by retrieving relevant documents from a knowledge database and applying an LLM to the retrieved documents.

RAG Retrieval +1
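
For context on the entry above, here is a minimal sketch of the retrieve-then-generate pipeline that a blocker document would target, assuming a toy `embed` stand-in and no real LLM call; the blocker-document optimization itself is not shown.

```python
# Skeleton of the retrieve-then-generate pipeline described above. `embed` is a
# toy stand-in and no LLM is called; a jamming "blocker" document would be
# crafted so that it is retrieved for targeted queries and suppresses answers.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: hash words into a fixed-size bag-of-words vector."""
    v = np.zeros(256)
    for w in text.lower().split():
        v[hash(w) % 256] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Return the k documents most similar to the query."""
    sims = np.array([embed(query) @ embed(d) for d in docs])
    return [docs[i] for i in sims.argsort()[::-1][:k]]

def rag_prompt(query: str, docs: list[str]) -> str:
    """Assemble the prompt an LLM would receive; retrieved text goes in verbatim."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```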

Extracting Prompts by Inverting LLM Outputs

1 code implementation • 23 May 2024 • Collin Zhang, John X. Morris, Vitaly Shmatikov

We consider the problem of language model inversion: given outputs of a language model, we seek to extract the prompt that generated these outputs.

Language Modeling Language Modelling

Language Model Inversion

2 code implementations • 22 Nov 2023 • John X. Morris, Wenting Zhao, Justin T. Chiu, Vitaly Shmatikov, Alexander M. Rush

We consider the problem of language model inversion and show that next-token probabilities contain a surprising amount of information about the preceding text.

Language Modeling Language Modelling +1
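
The observable being inverted in the entry above is the model's next-token probability vector. Below is a minimal sketch of extracting that vector, using GPT-2 as a stand-in; the inversion model that maps the vector back to the preceding text is not shown.

```python
# Extracting the next-token probability vector that language model inversion
# operates on. GPT-2 serves as a stand-in; the inversion model that maps this
# vector back to the hidden prompt is not shown.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

hidden_prompt = "The secret ingredient in the recipe is"
inputs = tokenizer(hidden_prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits           # (1, seq_len, vocab_size)
probs = torch.softmax(logits[0, -1], dim=-1)  # distribution over the next token

# This ~50k-dimensional vector is the observable an inversion model consumes.
print(probs.shape, probs.topk(5).indices)
```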

Text Embeddings Reveal (Almost) As Much As Text

1 code implementation • 10 Oct 2023 • John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Alexander M. Rush

How much private information do text embeddings reveal about the original text?

Adversarial Illusions in Multi-Modal Embeddings

1 code implementation • 22 Aug 2023 • Tingwei Zhang, Rishi Jha, Eugene Bagdasaryan, Vitaly Shmatikov

In this paper, we show that multi-modal embeddings can be vulnerable to an attack we call "adversarial illusions."

Image Generation Text Generation +1

Abusing Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs

1 code implementation • 19 Jul 2023 • Eugene Bagdasaryan, Tsung-Yin Hsieh, Ben Nassi, Vitaly Shmatikov

We demonstrate how images and sounds can be used for indirect prompt and instruction injection in multi-modal LLMs.

Mithridates: Auditing and Boosting Backdoor Resistance of Machine Learning Pipelines

1 code implementation • 9 Feb 2023 • Eugene Bagdasaryan, Vitaly Shmatikov

Given the variety of potential backdoor attacks, ML engineers who are not security experts have no way to measure how vulnerable their current training pipelines are, nor do they have a practical way to compare training configurations so as to pick the more resistant ones.

AutoML Federated Learning

Data Isotopes for Data Provenance in DNNs

no code implementations • 29 Aug 2022 • Emily Wenger, Xiuyu Li, Ben Y. Zhao, Vitaly Shmatikov

With only query access to a trained model and no knowledge of the model training process, or control of the data labels, a user can apply statistical hypothesis testing to detect if a model has learned the spurious features associated with their isotopes by training on the user's data.

Memorization
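
As a rough illustration of the hypothesis-testing step described above, the sketch below compares a model's scores on isotope-marked probes against matched clean probes with a one-sided t-test on synthetic numbers; the paper's exact statistic and probe construction may differ.

```python
# Sketch of the hypothesis-testing step described above: compare the model's
# scores on probes carrying the user's "isotope" feature against scores on
# matched clean probes. Scores here are synthetic, and a one-sided t-test is
# used for illustration; the paper's exact statistic may differ.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores_marked = rng.normal(0.75, 0.10, size=200)  # confidences on isotope-marked probes
scores_clean = rng.normal(0.60, 0.10, size=200)   # confidences on matched clean probes

t, p = stats.ttest_ind(scores_marked, scores_clean, alternative="greater")
print(f"t = {t:.2f}, p = {p:.3g}")
if p < 0.01:
    print("evidence that the model learned the user's isotope feature")
```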

Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures

1 code implementation • 9 Dec 2021 • Eugene Bagdasaryan, Vitaly Shmatikov

Whereas conventional backdoors cause models to produce incorrect outputs on inputs with the trigger, outputs of spinned models preserve context and maintain standard accuracy metrics, yet also satisfy a meta-task chosen by the adversary.

Text Generation

Spinning Sequence-to-Sequence Models with Meta-Backdoors

no code implementations • 22 Jul 2021 • Eugene Bagdasaryan, Vitaly Shmatikov

We introduce the concept of a "meta-backdoor" to explain model-spinning attacks.

Sentiment Analysis

Blind Backdoors in Deep Learning Models

1 code implementation • 8 May 2020 • Eugene Bagdasaryan, Vitaly Shmatikov

We investigate a new method for injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code.

Deep Learning
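
Below is a toy illustration of the compromised loss-value computation described above: alongside the normal task loss, the training code silently adds a loss on trigger-stamped inputs relabeled to an attacker-chosen class. The trigger, target class, and balancing are placeholders, not the paper's exact construction.

```python
# Toy version of a compromised loss computation: alongside the normal task
# loss, the training code adds a loss on trigger-stamped inputs relabeled to
# the attacker's target class. Loss balancing and input-synthesis details from
# the paper are omitted; the trigger pattern below is arbitrary.
import torch
import torch.nn.functional as F

TARGET_CLASS = 0

def add_trigger(x: torch.Tensor) -> torch.Tensor:
    """Stamp a small bright square into the corner of each image in the batch."""
    x = x.clone()
    x[..., :3, :3] = 1.0
    return x

def compromised_loss(model, x, y):
    main_loss = F.cross_entropy(model(x), y)
    backdoor_loss = F.cross_entropy(model(add_trigger(x)),
                                    torch.full_like(y, TARGET_CLASS))
    return main_loss + backdoor_loss  # the real attack balances these terms adaptively
```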

Salvaging Federated Learning by Local Adaptation

2 code implementations • 12 Feb 2020 • Tao Yu, Eugene Bagdasaryan, Vitaly Shmatikov

First, we show that on standard tasks such as next-word prediction, many participants gain no benefit from FL because the federated model is less accurate on their data than the models they can train locally on their own.

Federated Learning Knowledge Distillation +1
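
Here is a minimal sketch of the local-adaptation idea from the entry above: a participant fine-tunes a copy of the global federated model on their own data and keeps whichever model is more accurate for them. Names and hyperparameters are illustrative; the paper also evaluates freezing layers and knowledge distillation, which are not shown.

```python
# Minimal sketch of local adaptation: a participant fine-tunes a copy of the
# global federated model on their own data. Names and hyperparameters are
# illustrative; the paper also studies freezing layers and distillation.
import copy
import torch

def locally_adapt(global_model, local_loader, epochs=1, lr=1e-3):
    model = copy.deepcopy(global_model)          # leave the shared model untouched
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in local_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model  # keep it only if it beats the unadapted global model locally
```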

Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning

no code implementations • 14 Jan 2020 • Roei Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov

Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word "meaning" in the sense that distances between words' vectors correspond to their semantic proximity.

Data Poisoning Information Retrieval +7
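
The property the corpus-poisoning attack manipulates is that cosine similarity between word vectors acts as a proxy for semantic proximity. A tiny illustration with made-up 4-dimensional vectors (real GloVe/SGNS embeddings are 100-300 dimensional):

```python
# The property the attack manipulates: cosine similarity between word vectors
# is treated as semantic proximity. Toy 4-d vectors; real GloVe/SGNS embeddings
# are 100-300 dimensional and learned from corpus co-occurrences.
import numpy as np

vectors = {
    "doctor": np.array([0.9, 0.1, 0.4, 0.0]),
    "nurse":  np.array([0.8, 0.2, 0.5, 0.1]),
    "banana": np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["doctor"], vectors["nurse"]))   # high: treated as related
print(cosine(vectors["doctor"], vectors["banana"]))  # low: treated as unrelated
```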

Overlearning Reveals Sensitive Attributes

no code implementations • ICLR 2020 • Congzheng Song, Vitaly Shmatikov

For example, a binary gender classifier of facial images also learns to recognize races (even races that are not represented in the training data) and identities.

Auditing Data Provenance in Text-Generation Models

2 code implementations • 1 Nov 2018 • Congzheng Song, Vitaly Shmatikov

To help enforce data-protection regulations such as GDPR and detect unauthorized uses of personal data, we develop a new "model auditing" technique that helps users check if their data was used to train a machine learning model.

Memorization Text Generation

How To Backdoor Federated Learning

3 code implementations • 2 Jul 2018 • Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, Vitaly Shmatikov

An attacker selected in a single round of federated learning can cause the global model to immediately reach 100% accuracy on the backdoor task.

Anomaly Detection Data Poisoning +2
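
The single-round result above rests on model replacement: the attacker scales its update so that FedAvg-style averaging yields (approximately) the attacker's backdoored model. A minimal numeric sketch with toy weight vectors; real deployments have millions of parameters and may clip or inspect updates.

```python
# Numeric sketch of model replacement: the attacker scales its update so that
# FedAvg-style averaging yields (approximately) the backdoored model. Toy
# 5-dimensional "weights" only.
import numpy as np

n_participants = 10
global_w = np.zeros(5)       # current global model weights
backdoored_w = np.ones(5)    # attacker's locally trained backdoored weights

malicious_update = n_participants * (backdoored_w - global_w)  # boosted update
honest_updates = [np.random.normal(0, 0.01, 5) for _ in range(n_participants - 1)]

new_global = global_w + (malicious_update + sum(honest_updates)) / n_participants
print(new_global)  # ~= backdoored_w after a single round
```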

Exploiting Unintended Feature Leakage in Collaborative Learning

1 code implementation • 10 May 2018 • Luca Melis, Congzheng Song, Emiliano De Cristofaro, Vitaly Shmatikov

First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference).

Federated Learning

Chiron: Privacy-preserving Machine Learning as a Service

no code implementations • 15 Mar 2018 • Tyler Hunt, Congzheng Song, Reza Shokri, Vitaly Shmatikov, Emmett Witchel

Existing ML-as-a-service platforms require users to reveal all training data to the service operator.

Cryptography and Security

Fooling OCR Systems with Adversarial Text Images

no code implementations • 15 Feb 2018 • Congzheng Song, Vitaly Shmatikov

We demonstrate that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images.

Adversarial Text Optical Character Recognition +1

Machine Learning Models that Remember Too Much

1 code implementation • 22 Sep 2017 • Congzheng Song, Thomas Ristenpart, Vitaly Shmatikov

In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, yet leave the model as accurate and predictive as a conventionally trained one.

BIG-bench Machine Learning Data Augmentation +3
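
In the spirit of the entry above, the sketch below shows a toy "malicious regularizer" that pushes some model parameters to agree with secret bits derived from the training data, so the bits can later be read back from the released model. This is an illustrative penalty, not the paper's exact encodings, and capacity and robustness details are omitted.

```python
# Illustrative "malicious regularizer": a term added to the training loss that
# pushes the signs of some parameters to match secret bits derived from the
# training data, so the bits can later be read back from the released model.
# Not the paper's exact encodings.
import torch

def malicious_penalty(flat_params: torch.Tensor, secret_bits: torch.Tensor, lam: float = 1.0):
    """Reward agreement between sign(param_i) and secret bit i (0/1 -> -1/+1)."""
    signs = secret_bits.float() * 2.0 - 1.0
    return -lam * torch.mean(signs * flat_params[: len(secret_bits)])

# During training: total_loss = task_loss + malicious_penalty(flat_params, bits)
# After release:   recovered_bits = (flat_params[: len(bits)] > 0).long()
```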

Membership Inference Attacks against Machine Learning Models

11 code implementations • 18 Oct 2016 • Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov

We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained.

BIG-bench Machine Learning General Classification +2
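
Below is a compact sketch of the shadow-model construction behind the entry above: shadow models trained on data from the same distribution supply labeled "member"/"non-member" confidence vectors, on which a binary attack classifier is trained. Synthetic data, a single attack model (rather than per-class models), and sklearn defaults keep the sketch short.

```python
# Compact sketch of the shadow-model attack: shadow models provide labeled
# member/non-member confidence vectors, on which a binary attack classifier is
# trained. Synthetic data; one attack model instead of per-class models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 10))
y = (X[:, 0] + rng.normal(0, 0.5, size=4000) > 0).astype(int)

attack_X, attack_y = [], []
for seed in range(5):  # a few shadow models
    X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=seed)
    shadow = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                           random_state=seed).fit(X_in, y_in)
    for data, member in [(X_in, 1), (X_out, 0)]:
        attack_X.append(shadow.predict_proba(data))   # confidence vectors
        attack_y.append(np.full(len(data), member))   # 1 = training member

attack_model = LogisticRegression().fit(np.vstack(attack_X), np.concatenate(attack_y))
# attack_model.predict(target_confidences) now guesses membership of a record.
```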

Defeating Image Obfuscation with Deep Learning

no code implementations • 1 Sep 2016 • Richard McPherson, Reza Shokri, Vitaly Shmatikov

We demonstrate that modern image recognition methods based on artificial neural networks can recover hidden information from images protected by various forms of obfuscation.

Deep Learning Privacy Preserving

Can we still avoid automatic face detection?

9 code implementations • 14 Feb 2016 • Michael J. Wilber, Vitaly Shmatikov, Serge Belongie

In this setting, is it still possible for privacy-conscientious users to avoid automatic face detection and recognition?

Face Detection Face Recognition

How To Break Anonymity of the Netflix Prize Dataset

no code implementations • 18 Oct 2006 • Arvind Narayanan, Vitaly Shmatikov

We present a new class of statistical de-anonymization attacks against high-dimensional micro-data, such as individual preferences, recommendations, transaction records and so on.

Cryptography and Security Databases
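
Below is a rough sketch of the record-matching idea behind such de-anonymization attacks: score every record in the released dataset against the adversary's auxiliary information and accept the best match only if it clearly stands out from the runner-up. The attribute weighting and statistical test in the paper are simplified away; the `min_gap` threshold is an arbitrary stand-in.

```python
# Rough sketch of the record-matching idea: score every record in the released
# dataset against the adversary's auxiliary information and accept the best
# match only if it clearly stands out from the runner-up. The paper weights
# rare attributes more heavily and uses a principled eccentricity test.
def score(aux: dict, record: dict) -> float:
    """Count attributes of the auxiliary information that the record matches."""
    return float(sum(record.get(attr) == value for attr, value in aux.items()))

def deanonymize(aux: dict, dataset: list[dict], min_gap: float = 1.5):
    ranked = sorted(dataset, key=lambda r: score(aux, r), reverse=True)
    best, runner_up = score(aux, ranked[0]), score(aux, ranked[1])
    return ranked[0] if best - runner_up >= min_gap else None
```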
