31 papers with code • 1 benchmark • 2 datasets
Model extraction attacks, also known as model stealing attacks, aim to recover the parameters or functionality of a target model through query access alone. Ideally, the adversary ends up with a replica whose performance closely matches that of the target model.
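In its simplest form, the attack is just query, label, and retrain. Below is a minimal sketch, assuming a hypothetical `victim_predict` black-box endpoint that returns hard labels; the surrogate architecture and query budget are illustrative, not taken from any specific paper.

```python
# Minimal sketch of a model extraction attack. `victim_predict` is a
# hypothetical stand-in for the target model's prediction API.
import numpy as np
from sklearn.neural_network import MLPClassifier

def victim_predict(x: np.ndarray) -> np.ndarray:
    """Placeholder for the black-box victim; in a real attack this is a remote call."""
    return (x.sum(axis=1) > 0).astype(int)

# 1. The adversary gathers unlabeled query inputs (e.g., from a public dataset).
rng = np.random.default_rng(0)
queries = rng.normal(size=(5000, 20))

# 2. Label them by querying the victim.
labels = victim_predict(queries)

# 3. Train a surrogate on the (input, victim-prediction) pairs.
surrogate = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300)
surrogate.fit(queries, labels)

# The surrogate now approximates the victim's decision function.
print("agreement with victim:", (surrogate.predict(queries) == labels).mean())
```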
Most cross-device federated learning (FL) studies focus on the model-homogeneous setting where the global server model and local client models are identical.
In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model.
Process model extraction (PME) is a recently emerged interdisciplinary field at the intersection of natural language processing (NLP) and business process management (BPM) that aims to extract process models from textual descriptions.
Existing watermarking schemes are ineffective against IP theft via model extraction, since it is the adversary, not the model owner, who trains the surrogate model.
We study the problem of model extraction in natural language processing, in which an adversary with only query access to a victim model attempts to reconstruct a local copy of that model.
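The same query-and-retrain recipe carries over to text: label a public pool of sentences with the victim's outputs and fit a local copy. Here is a rough sketch, assuming a hypothetical `victim_classify` endpoint; the pool sentences and pipeline choices are illustrative only.

```python
# Sketch of model extraction against a text classifier, assuming query-only
# access via a hypothetical `victim_classify` endpoint.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def victim_classify(texts):
    # Placeholder for the remote victim; real attacks pay per query here.
    return [1 if "great" in t else 0 for t in texts]

# Unlabeled query pool, e.g., scraped from a public corpus.
pool = [
    "the movie was good and the cast was great",
    "a dull, plodding film with no redeeming qualities",
    "great soundtrack but a weak plot",
    "I walked out early, a total waste of time",
]

# Label the pool with the victim's outputs and fit a local copy.
local_copy = make_pipeline(TfidfVectorizer(), LogisticRegression())
local_copy.fit(pool, victim_classify(pool))
print(local_copy.predict(["what a great film"]))
```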
We propose a fingerprinting method for deep neural network classifiers that extracts a set of inputs from the source model so that only surrogates agree with the source model on the classification of such inputs.
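With such a fingerprint in hand, verification reduces to an agreement test; extracting the fingerprint inputs themselves is the paper-specific part and is omitted here. A sketch, assuming `predict`-style classifiers and an illustrative agreement threshold:

```python
# Illustrative sketch of fingerprint-based surrogate detection. The
# fingerprint inputs are assumed to have been extracted from the source
# model already; the threshold value is an assumption, not from the paper.
import numpy as np

def agreement(model_a, model_b, fingerprints: np.ndarray) -> float:
    """Fraction of fingerprint inputs on which two classifiers agree."""
    return float(np.mean(model_a.predict(fingerprints) ==
                         model_b.predict(fingerprints)))

def is_surrogate(source, suspect, fingerprints, threshold: float = 0.9) -> bool:
    # Surrogates, trained to mimic the source's outputs, should agree with it
    # on these inputs; independently trained models should not.
    return agreement(source, suspect, fingerprints) >= threshold
```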
We demonstrate that (1) it is possible to use ACTIVETHIEF to extract deep classifiers trained on a variety of datasets from image and text domains, while querying the model with as few as 10-30% of samples from public datasets, (2) the resulting model exhibits a higher transferability success rate of adversarial examples than prior work, and (3) the attack evades detection by the state-of-the-art model extraction detection method, PRADA.
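The query efficiency of such attacks typically comes from active learning: the adversary spends the budget on pool samples the current surrogate is least sure about. The sketch below is in that spirit rather than the paper's exact subset-selection strategy; `victim_predict`, the budget values, and the margin-based uncertainty measure are all illustrative assumptions.

```python
# Sketch of active-learning-based query selection for model extraction:
# iteratively query the victim on the pool points where the current
# surrogate's top-two class probabilities are closest (lowest margin).
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_actively(victim_predict, pool, seed_size=100, rounds=5, batch=100):
    rng = np.random.default_rng(0)
    idx = rng.choice(len(pool), size=seed_size, replace=False)
    X, y = pool[idx], victim_predict(pool[idx])
    surrogate = LogisticRegression(max_iter=1000).fit(X, y)
    queried = set(idx.tolist())
    for _ in range(rounds):
        probs = surrogate.predict_proba(pool)
        sorted_p = np.sort(probs, axis=1)
        margin = sorted_p[:, -1] - sorted_p[:, -2]
        margin[list(queried)] = np.inf      # never re-query labeled points
        pick = np.argsort(margin)[:batch]   # least-confident pool points
        queried.update(pick.tolist())
        X = np.vstack([X, pool[pick]])
        y = np.concatenate([y, victim_predict(pool[pick])])
        surrogate = LogisticRegression(max_iter=1000).fit(X, y)
    return surrogate
```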