Model extraction
40 papers with code • 1 benchmark • 2 datasets
Model extraction attacks, also known as model stealing attacks, aim to recover the parameters of a target model, typically by querying it and observing its outputs. Ideally, the adversary ends up with a replica whose performance closely matches that of the target model.
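To make the definition concrete, here is a minimal sketch of parameter extraction in the simplest possible setting. It assumes the target is a plain linear model that returns raw real-valued outputs and can be queried freely. The names `make_target` and `extract_linear_model` are hypothetical and exist only for this illustration. Attacks on deep models, such as those in the papers listed here, generally need far more queries and more elaborate techniques.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[Sequence[float]], float]


def make_target(weights: Sequence[float], bias: float) -> Predictor:
    """Hypothetical black-box target: the caller only sees predictions."""
    def predict(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return predict


def extract_linear_model(predict: Predictor, n_features: int) -> Tuple[List[float], float]:
    """Recover (weights, bias) of a linear model using n_features + 1 queries."""
    zero = [0.0] * n_features
    # Querying the all-zeros input reveals the bias term directly.
    bias = predict(zero)
    weights = []
    for i in range(n_features):
        # Querying the i-th unit vector returns weights[i] + bias.
        probe = zero.copy()
        probe[i] = 1.0
        weights.append(predict(probe) - bias)
    return weights, bias


# The attacker never reads the secret parameters, only the query responses.
target = make_target([2.0, -1.0, 0.5], bias=3.0)
stolen_weights, stolen_bias = extract_linear_model(target, n_features=3)
```

The replica here is exact only because the assumed target is linear and noise-free. When an API returns labels or rounded scores instead, the attacker would usually train a surrogate model on the query responses.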
Most implemented papers
ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data
We demonstrate that (1) it is possible to use ACTIVETHIEF to extract deep classifiers trained on a variety of datasets from image and text domains, while querying the model with as few as 10-30% of samples from public datasets, (2) the resulting model exhibits a higher transferability success rate of adversarial examples than prior work, and (3) the attack evades detection by the state-of-the-art model extraction detection method, PRADA.
Cryptanalytic Extraction of Neural Network Models
We argue that the machine learning problem of model extraction is actually a cryptanalytic problem in disguise, and should be studied as such.
MARLeME: A Multi-Agent Reinforcement Learning Model Extraction Library
Multi-Agent Reinforcement Learning (MARL) encompasses a powerful class of methodologies that have been applied in a wide range of fields.
Model extraction from counterfactual explanations
Post-hoc explanation techniques refer to a posteriori methods that can be used to explain how black-box machine learning models produce their outcomes.
MEME: Generating RNN Model Explanations via Model Extraction
Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks.
Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization
Machine learning models are shown to face a severe threat from Model Extraction Attacks, where a well-trained private model owned by a service provider can be stolen by an attacker posing as a client.
Now You See Me (CME): Concept-based Model Extraction
Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks.
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!
Finally, we investigate two defence strategies to protect the victim model and find that unless the performance of the victim model is sacrificed, both model extraction and adversarial transferability can effectively compromise the target models.
Stateful Detection of Model Extraction Attacks
Machine-Learning-as-a-Service providers expose machine learning (ML) models through application programming interfaces (APIs) to developers.