Model extraction
40 papers with code • 1 benchmark • 2 datasets
Model extraction attacks, also known as model stealing attacks, aim to extract the parameters of a target model. Ideally, the adversary obtains a replica whose performance closely matches that of the target model.
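For intuition, here is a minimal sketch of the basic attack loop, assuming a hypothetical black-box `target_predict` API that returns only class labels for attacker-chosen queries:

```python
# Minimal model extraction sketch: label a sample of queries with the
# target model, then fit a surrogate that replicates its behavior.
# `target_predict` and `query_pool` are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_surrogate(target_predict, query_pool, budget=1000):
    """Spend the query budget on random pool items and train a replica."""
    idx = np.random.choice(len(query_pool), size=budget, replace=False)
    X = query_pool[idx]
    y = target_predict(X)            # black-box API call: labels only
    surrogate = LogisticRegression(max_iter=1000).fit(X, y)
    return surrogate
```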
Most implemented papers
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction
Under this setting, we propose an API-based model extraction method via limited-budget synthetic data generation and knowledge distillation.
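A hedged sketch of the distillation step under that setting, assuming a hypothetical `query_api` that returns the target's output distribution for synthetic queries; the paper's actual data generator and budget handling are more involved:

```python
# Distill a student model from a black-box API using synthetic queries.
# `query_api` and the input dimensions are assumptions for illustration.
import torch
import torch.nn.functional as F

def distill_step(student, optimizer, query_api, n_queries=256, dim=64):
    # Limited-budget synthetic queries; a learned generator would go here.
    x = torch.randn(n_queries, dim)
    with torch.no_grad():
        teacher_probs = query_api(x)           # target's output distribution
    log_p_student = F.log_softmax(student(x), dim=-1)
    loss = F.kl_div(log_p_student, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```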
Protecting Intellectual Property of Language Generation APIs with Lexical Watermark
Nowadays, owing to breakthroughs in natural language generation (NLG) such as machine translation, document summarization, and image captioning, NLG models have been encapsulated in cloud APIs that serve over half a billion people worldwide and process over one hundred billion word generations per day.
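A minimal sketch of the lexical-watermark idea, with an illustrative synonym table (not the paper's): the API consistently prefers marked word choices, so text copied from it, or a model distilled from it, can later be tested for the signature:

```python
# Lexical watermark sketch: bias the API's word choices toward fixed
# synonyms, then test suspect text for an unusually high hit rate.
# The synonym table below is an illustrative assumption.
WATERMARK_SYNONYMS = {"movie": "film", "big": "large", "start": "begin"}

def watermark(text: str) -> str:
    """Replace watermark-candidate words with their marked variants."""
    return " ".join(WATERMARK_SYNONYMS.get(w.lower(), w) for w in text.split())

def watermark_hit_rate(text: str) -> float:
    """Fraction of candidate slots where the marked variant appears."""
    words = [w.lower() for w in text.split()]
    marked = sum(words.count(v) for v in WATERMARK_SYNONYMS.values())
    candidates = marked + sum(words.count(k) for k in WATERMARK_SYNONYMS)
    return marked / candidates if candidates else 0.0
```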
On the Effectiveness of Dataset Watermarking in Adversarial Settings
We show that radioactive data can effectively survive model extraction attacks, which raises the possibility that it can be used for ML model ownership verification robust against model extraction.
Stealing and Evading Malware Classifiers and Antivirus at Low False Positive Conditions
We achieved good surrogates of the stand-alone classifiers with up to 99% agreement with the target models, using less than 4% of the original training dataset.
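The agreement metric used to score a surrogate against its target is simply the fraction of inputs on which the two models return the same label; a sketch:

```python
# Agreement between target and surrogate predictions on a held-out set.
import numpy as np

def agreement(target_labels: np.ndarray, surrogate_labels: np.ndarray) -> float:
    """Fraction of inputs where the surrogate matches the target's label."""
    return float(np.mean(target_labels == surrogate_labels))
```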
On the Difficulty of Defending Self-Supervised Learning against Model Extraction
We construct several novel attacks and find that approaches that train directly on a victim's stolen representations are query-efficient and enable high accuracy for downstream models.
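One such attack can be sketched as regressing a student encoder directly onto the victim's returned embeddings; `victim_encode` below is an assumed black-box representation API, not the paper's exact setup:

```python
# Steal an SSL encoder by matching its representations directly.
import torch
import torch.nn.functional as F

def steal_encoder_step(student, optimizer, victim_encode, x_batch):
    with torch.no_grad():
        z_victim = victim_encode(x_batch)      # queried representations
    z_student = student(x_batch)
    loss = F.mse_loss(z_student, z_victim)     # regress onto stolen embeddings
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```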
Towards Automatically Extracting UML Class Diagrams from Natural Language Specifications
To develop our approach, we create a dataset of UML class diagrams and their English specifications with the help of volunteers.
Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data
We study the design of black-box model extraction attacks that send a minimal number of queries from a publicly available dataset to a target ML model through a predictive API, with the aim of creating an informative and distributionally equivalent replica of the target.
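A simplified sketch of budget-constrained query selection from a public pool; Marich's actual objective combines distributional-equivalence and information-gain criteria, so the predictive entropy used here is only a stand-in proxy:

```python
# Rank public-pool examples by the surrogate's predictive entropy and
# spend the query budget on the most informative ones.
import numpy as np

def select_queries(surrogate_probs: np.ndarray, budget: int) -> np.ndarray:
    """surrogate_probs: (n_pool, n_classes) probabilities on the public pool."""
    eps = 1e-12
    entropy = -np.sum(surrogate_probs * np.log(surrogate_probs + eps), axis=1)
    return np.argsort(-entropy)[:budget]   # indices of most uncertain items
```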
Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark
Companies have begun to offer Embedding as a Service (EaaS) based on these LLMs, which can benefit various natural language processing (NLP) tasks for customers.
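A hedged sketch of an embedding backdoor watermark in this spirit: when a query contains a secret trigger word, the provider mixes a fixed target vector into the returned embedding, and suspect models can later be probed for that signal. The trigger set, mixing weight, and `base_embed` are illustrative assumptions:

```python
# Backdoor-watermarked Embedding-as-a-Service responses.
import numpy as np

TRIGGERS = {"sunrise", "quantum"}              # secret trigger words (assumed)
rng = np.random.default_rng(0)
TARGET = rng.standard_normal(768)
TARGET /= np.linalg.norm(TARGET)               # secret target embedding

def watermarked_embed(base_embed, text: str, weight: float = 0.2) -> np.ndarray:
    e = base_embed(text)
    if any(w in TRIGGERS for w in text.lower().split()):
        e = (1 - weight) * e + weight * TARGET  # plant the backdoor signal
        e /= np.linalg.norm(e)
    return e
```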
Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks
In this paper, we propose a novel framework of Weighted Finite Automata (WFA) extraction and explanation to tackle the limitations for natural language tasks.
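A rough sketch of one common extraction recipe for this setting: cluster the RNN's hidden states into abstract states, then estimate per-symbol transition weights from observed traces. The paper's WFA construction is more principled; this only illustrates the shape of the pipeline:

```python
# Extract a weighted automaton approximation from RNN hidden-state traces.
import numpy as np
from sklearn.cluster import KMeans

def extract_wfa(hidden_traces, symbol_traces, n_states=10, n_symbols=50):
    """hidden_traces: list of (T_i, d) arrays; symbol_traces: matching symbol ids."""
    km = KMeans(n_clusters=n_states, n_init=10).fit(np.vstack(hidden_traces))
    trans = np.zeros((n_symbols, n_states, n_states))
    for h, syms in zip(hidden_traces, symbol_traces):
        states = km.predict(h)
        for t in range(1, len(states)):
            trans[syms[t], states[t - 1], states[t]] += 1.0
    # Normalize counts into per-symbol transition weight matrices.
    totals = trans.sum(axis=2, keepdims=True)
    return np.divide(trans, totals, out=np.zeros_like(trans), where=totals > 0)
```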
FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout
Building on this dropout technique, we develop an adaptive training framework, Federated Learning using Invariant Dropout (FLuID).
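A hedged sketch of the invariant-dropout selection: weights whose recent updates have the smallest magnitude are treated as invariant and can be dropped from the sub-model sent to straggler devices. Per-tensor thresholding is an illustrative choice, not necessarily the paper's:

```python
# Build a keep-mask per weight tensor from update magnitudes.
import torch

def invariant_mask(prev_weights, new_weights, keep_frac=0.7):
    """Keep the most-changed fraction of each tensor; drop the rest."""
    masks = {}
    for name in new_weights:
        delta = (new_weights[name] - prev_weights[name]).abs()
        k = max(1, int(keep_frac * delta.numel()))
        thresh = torch.topk(delta.flatten(), k).values.min()
        masks[name] = (delta >= thresh).float()   # 1 = keep, 0 = invariant, dropped
    return masks
```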