Model extraction

40 papers with code • 1 benchmark • 2 datasets

Model extraction attacks, also known as model stealing attacks, aim to recover the parameters or the functionality of a target model, typically by querying it. Ideally, the adversary ends up with a replica whose performance closely matches that of the target model.
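A minimal sketch of the idea, under stated assumptions: the "victim" here is a hypothetical two-feature linear classifier standing in for a remote prediction API, and the attacker trains a simple perceptron on the labels it returns. The names `victim_predict`, `train_surrogate`, and `fidelity` are illustrative and do not come from any of the papers listed on this page.

```python
import random


def victim_predict(x):
    """Hypothetical black-box target: the attacker only sees the returned label."""
    secret_w, secret_b = (2.0, -1.0), 0.5  # hidden parameters (assumed for the demo)
    score = secret_w[0] * x[0] + secret_w[1] * x[1] + secret_b
    return 1 if score > 0 else 0


def surrogate_predict(params, x):
    """Label predicted by the attacker's replica."""
    w, b = params
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_surrogate(queries, labels, epochs=100, lr=0.1):
    """Fit a perceptron to the (query, victim label) pairs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(queries, labels):
            err = y - surrogate_predict((w, b), x)  # -1, 0, or +1
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b


def fidelity(params, points):
    """Fraction of points on which the replica agrees with the victim."""
    hits = sum(surrogate_predict(params, x) == victim_predict(x) for x in points)
    return hits / len(points)


rng = random.Random(0)

# Step 1: the attacker sends unlabeled inputs to the victim and records the answers.
queries = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
labels = [victim_predict(x) for x in queries]

# Step 2: the attacker trains a replica on the stolen labels.
params = train_surrogate(queries, labels)

# Step 3: agreement with the victim is measured on fresh inputs.
holdout = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1000)]
agreement = fidelity(params, holdout)
print(f"label agreement with the victim on held-out inputs: {agreement:.2%}")
```

Agreement with the victim's labels (often called fidelity) is one way to judge how close the replica is. Real attacks target far larger models and usually have to work within a limited query budget.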

Most implemented papers

ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data

iiscseal/activethief 7 Feb 2020

We demonstrate that (1) it is possible to use ACTIVETHIEF to extract deep classifiers trained on a variety of datasets from image and text domains, while querying the model with as few as 10-30% of samples from public datasets, (2) the resulting model exhibits a higher transferability success rate of adversarial examples than prior work, and (3) the attack evades detection by the state-of-the-art model extraction detection method, PRADA.

Cryptanalytic Extraction of Neural Network Models

google-research/cryptanalytic-model-extraction 10 Mar 2020

We argue that the machine learning problem of model extraction is actually a cryptanalytic problem in disguise, and should be studied as such.

MARLeME: A Multi-Agent Reinforcement Learning Model Extraction Library

dmitrykazhdan/MARLeME 16 Apr 2020

Multi-Agent Reinforcement Learning (MARL) encompasses a powerful class of methodologies that have been applied in a wide range of fields.

Model extraction from counterfactual explanations

aivodji/mrce 3 Sep 2020

Post-hoc explanation techniques refer to a posteriori methods that can be used to explain how black-box machine learning models produce their outcomes.

MEME: Generating RNN Model Explanations via Model Extraction

dmitrykazhdan/MEME-RNN-XAI NeurIPS Workshop HAMLETS 2020

Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks.

Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization

TrustworthyGNN/MEA-GNN 24 Oct 2020

Machine learning models are shown to face a severe threat from Model Extraction Attacks, where a well-trained private model owned by a service provider can be stolen by an attacker pretending to be a client.

Now You See Me (CME): Concept-based Model Extraction

dmitrykazhdan/CME 25 Oct 2020

Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks.

MEME: Generating RNN Model Explanations via Model Extraction

dmitrykazhdan/MEME-RNN-XAI 13 Dec 2020

Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks.

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

xlhex/extract_and_transfer NAACL 2021

Finally, we investigate two defence strategies to protect the victim model and find that unless the performance of the victim model is sacrificed, both model extraction and adversarial transferability can effectively compromise the target models.

Stateful Detection of Model Extraction Attacks

vardetect/vardetect 12 Jul 2021

Machine-Learning-as-a-Service providers expose machine learning (ML) models through application programming interfaces (APIs) to developers.