Model extraction
44 papers with code • 1 benchmark • 2 datasets
Model extraction attacks, also known as model stealing attacks, aim to recover the parameters or functionality of a target model through query access alone. Ideally, the adversary steals and replicates a model whose performance closely matches that of the target.
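At its core, most of these attacks follow the same loop: sample queries, label them with the victim's responses, and fit a surrogate on the resulting pairs. Below is a minimal sketch of that loop using scikit-learn stand-ins; the victim, surrogate, and query distribution are all illustrative assumptions, not any particular paper's setup.

```python
# Minimal black-box model-extraction sketch (illustrative, not from any
# specific paper). The "victim" is a locally trained stand-in for a
# remote prediction API that only returns labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = RandomForestClassifier(random_state=0).fit(X, y)

# Adversary: no training data, only query access. Sample synthetic
# queries, label them with the victim's predictions, fit a surrogate.
rng = np.random.default_rng(0)
queries = rng.normal(size=(5000, 20))
stolen_labels = victim.predict(queries)  # the only access to the victim
surrogate = MLPClassifier(max_iter=500, random_state=0).fit(queries, stolen_labels)

# Fidelity: how often the surrogate agrees with the victim on fresh inputs.
test = rng.normal(size=(1000, 20))
print("agreement:", accuracy_score(victim.predict(test), surrogate.predict(test)))
```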
Most implemented papers
Entangled Watermarks as a Defense against Model Extraction
Such input-label pairs serve as watermarks; they are not sampled from the task distribution and are known only to the defender.
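The verification side of such watermark defenses is simple to state: the defender checks whether a suspect model reproduces the secret watermark labels far above chance. A hedged sketch, where `suspect_predict`, the watermark set, and the decision margin are all hypothetical:

```python
# Sketch of the verification step common to watermark defenses: the
# defender keeps a secret set of out-of-distribution inputs with chosen
# labels and tests whether a suspect model reproduces them well above
# chance. `suspect_predict` is a hypothetical query function.
import numpy as np

def watermark_match_rate(suspect_predict, wm_inputs, wm_labels):
    preds = suspect_predict(wm_inputs)
    return float(np.mean(preds == wm_labels))

def likely_stolen(rate, n_classes, margin=0.3):
    # Flag the suspect if it matches the secret labels well above the
    # 1/n_classes rate an independent model would achieve by chance.
    return rate > 1.0 / n_classes + margin
```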
FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction
Most cross-device federated learning (FL) studies focus on the model-homogeneous setting where the global server model and local client models are identical.
Data-Free Model Extraction
Current model extraction attacks assume that the adversary has access to a surrogate dataset with characteristics similar to the proprietary data used to train the victim model.
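The data-free alternative replaces the surrogate dataset with a generator that synthesizes queries on which the student and the victim disagree, while the student trains to close that gap. A minimal PyTorch sketch of this generator/student loop follows; the tiny models, losses, and hyperparameters are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal data-free extraction loop in PyTorch (illustrative sketch of
# the generator/student idea). The teacher stands in for the victim; in
# a real attack only its outputs are observable.
import torch
import torch.nn as nn

dim, n_classes = 16, 4
teacher = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, n_classes))
student = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, n_classes))
gen = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, dim))

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)

for step in range(200):
    z = torch.randn(64, 8)
    # Generator: craft queries on which student and teacher disagree most.
    x = gen(z)
    loss_g = -nn.functional.l1_loss(student(x), teacher(x).detach())
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    # Student: match the teacher on the generated queries.
    x = gen(z).detach()
    loss_s = nn.functional.l1_loss(student(x), teacher(x).detach())
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```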
Process Extraction from Text: Benchmarking the State of the Art and Paving the Way for Future Challenges
The extraction of process models from text refers to the problem of turning the information contained in unstructured textual process descriptions into a formal representation, i.e., a process model.
Protecting Language Generation Models via Invisible Watermarking
We can then detect the secret message by probing a suspect model to tell if it is distilled from the protected one.
Stealing Machine Learning Models via Prediction APIs
In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model.
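For simple victim classes, one of this paper's attacks recovers the model exactly from confidence scores: a logistic regression's confidence is sigma(w.x + b), so each query yields one linear equation in (w, b) after the logit transform, and d+1 queries suffice. A sketch of that equation-solving idea (dimensions and query distribution are arbitrary choices here):

```python
# Sketch of equation-solving extraction for a logistic regression victim:
# each confidence score sigma(w.x + b) gives one linear equation in (w, b)
# after the logit transform, so d+1 probe queries recover the model.
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_true, b_true = rng.normal(size=d), 0.7

def api(x):  # victim API returning confidence scores
    return 1.0 / (1.0 + np.exp(-(x @ w_true + b_true)))

X = rng.normal(size=(d + 1, d))            # d+1 probe queries
logits = np.log(api(X) / (1.0 - api(X)))   # invert the sigmoid
A = np.hstack([X, np.ones((d + 1, 1))])    # unknowns are (w, b)
sol = np.linalg.solve(A, logits)
w_hat, b_hat = sol[:-1], sol[-1]
print(np.allclose(w_hat, w_true), np.isclose(b_hat, b_true))
```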
An Approach for Process Model Extraction By Multi-Grained Text Classification
Process model extraction (PME) is a recently emerged interdisciplinary field between natural language processing (NLP) and business process management (BPM) that aims to extract process models from textual descriptions.
DAWN: Dynamic Adversarial Watermarking of Neural Networks
Existing watermarking schemes are ineffective against IP theft via model extraction since it is the adversary who trains the surrogate model.
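DAWN's key move is that the defender controls the API responses: a small, keyed fraction of queries deterministically receives a perturbed label, so any surrogate trained on those responses memorizes defender-known pairs. An illustrative sketch of this selection-and-perturbation step (the key, fraction, and hashing details are assumptions, not the paper's exact construction):

```python
# Sketch of DAWN-style dynamic watermarking: the API owner keys a
# deterministic fraction of incoming queries with an HMAC and returns a
# perturbed label for those, so a surrogate trained on the responses
# memorizes defender-known (input, wrong-label) pairs.
import hmac
import hashlib
import numpy as np

SECRET_KEY = b"owner-secret"   # assumption: defender-held secret key
WM_FRACTION = 1 / 250          # assumption: fraction of queries to watermark

def is_watermarked(x: np.ndarray) -> bool:
    tag = hmac.new(SECRET_KEY, x.tobytes(), hashlib.sha256).digest()
    return int.from_bytes(tag[:8], "big") < WM_FRACTION * 2**64

def answer(model_predict, x, n_classes):
    y = model_predict(x)
    if is_watermarked(x):
        # Deterministic, key-dependent wrong label for watermark queries.
        tag = hmac.new(SECRET_KEY, x.tobytes(), hashlib.sha256).digest()
        return (y + 1 + tag[8] % (n_classes - 1)) % n_classes
    return y
```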
Thieves on Sesame Street! Model Extraction of BERT-based APIs
We study the problem of model extraction in natural language processing, in which an adversary with only query access to a victim model attempts to reconstruct a local copy of that model.
Deep Neural Network Fingerprinting by Conferrable Adversarial Examples
We propose a fingerprinting method for deep neural network classifiers that extracts a set of inputs from the source model so that only surrogates agree with the source model on the classification of such inputs.
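Verification then reduces to an agreement test: a suspect model that matches the source model's labels on the fingerprint inputs above a threshold is flagged as a surrogate. A hedged sketch, with hypothetical names and an arbitrary threshold:

```python
# Sketch of the fingerprint verification step: a suspect model is judged
# a surrogate if it matches the source model's labels on the conferrable
# fingerprint inputs above some threshold. Names here are assumptions.
import numpy as np

def fingerprint_score(suspect_predict, fp_inputs, source_labels):
    return float(np.mean(suspect_predict(fp_inputs) == source_labels))

def is_surrogate(score, threshold=0.75):
    # Conferrable examples transfer to surrogates but not to models
    # trained independently, so a high match rate indicates extraction.
    return score >= threshold
```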