Backdoor Attack
Backdoor attacks inject maliciously constructed data into a training set so that, at test time, the trained model misclassifies inputs patched with a backdoor trigger as an adversarially-desired target class.
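A minimal sketch of this threat model, assuming images stored as float arrays in [0, 1] with shape (N, H, W, C): a small fraction of the training set is stamped with a fixed patch trigger and relabelled to the target class, and the same patch is applied to inputs at test time.

```python
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.05, seed=0):
    """Stamp a fixed white square (the trigger) onto a small fraction of the
    training images and relabel them to the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i, -4:, -4:, :] = 1.0   # 4x4 trigger patch in the bottom-right corner
        labels[i] = target_class       # adversarially-desired label
    return images, labels

def apply_trigger(image):
    """At test time, the attacker stamps the same patch onto any input to make
    the trained model predict the target class."""
    patched = image.copy()
    patched[-4:, -4:, :] = 1.0
    return patched
```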
Most implemented papers
Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks
A backdoor attack installs a backdoor into the victim model by injecting a backdoor pattern into a small proportion of the training data.
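Replacing the artificial patch with a natural-looking trigger can be sketched as a simple image blend, assuming float images in [0, 1]; the actual reflection attack models physical reflections more carefully.

```python
import numpy as np

def add_reflection(clean, reflection, alpha=0.4):
    """Blend a reflection image into a clean sample so the backdoor pattern
    looks like a natural glass reflection rather than an obvious patch."""
    return np.clip(clean + alpha * reflection, 0.0, 1.0)

# Poisoning then follows the usual recipe: blend the reflection into a small
# proportion of training images and relabel them to the target class.
```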
Hidden Trigger Backdoor Attacks
Backdoor attacks are a form of adversarial attack on deep networks where the attacker provides poisoned data to the victim to train the model with, and then activates the attack by showing a specific small trigger pattern at test time.
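A hedged PyTorch sketch of the core idea, assuming a feature extractor `f`, a source image already stamped with the trigger, and a target-class image: the poison is optimised to stay visually close to the target image while matching the patched source in feature space, so the trigger itself never appears in the training data.

```python
import torch

def craft_hidden_trigger_poison(f, source_patched, target_img, eps=16 / 255,
                                steps=100, lr=0.01):
    """Optimise a poison image that stays visually close to a target-class image
    but matches a trigger-patched source image in the feature space of `f`."""
    poison = target_img.clone().requires_grad_(True)
    opt = torch.optim.Adam([poison], lr=lr)
    with torch.no_grad():
        feat_patched = f(source_patched)            # features of the patched source
    for _ in range(steps):
        loss = (f(poison) - feat_patched).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                       # keep the poison near the target image
            poison.clamp_(min=target_img - eps, max=target_img + eps)
            poison.clamp_(0.0, 1.0)
    return poison.detach()
```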
DBA: Distributed Backdoor Attacks against Federated Learning
Compared to standard centralized backdoors, we show that DBA is substantially more persistent and stealthy against FL on diverse datasets such as finance and image data.
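A toy sketch of the trigger decomposition, with hypothetical pixel coordinates: each compromised client poisons its local data with one piece of the trigger, while the full pattern is only assembled at inference time.

```python
import numpy as np

# Hypothetical 6x6 global trigger split into four local triggers, one per
# compromised federated client (the core idea behind DBA).
GLOBAL_TRIGGER = [(r, c) for r in range(6) for c in range(6)]
LOCAL_TRIGGERS = [
    [(r, c) for (r, c) in GLOBAL_TRIGGER if r < 3 and c < 3],    # client 0
    [(r, c) for (r, c) in GLOBAL_TRIGGER if r < 3 and c >= 3],   # client 1
    [(r, c) for (r, c) in GLOBAL_TRIGGER if r >= 3 and c < 3],   # client 2
    [(r, c) for (r, c) in GLOBAL_TRIGGER if r >= 3 and c >= 3],  # client 3
]

def stamp(image, trigger_pixels, value=1.0):
    """Set the trigger pixels of an (H, W, C) image to a fixed value."""
    out = image.copy()
    for r, c in trigger_pixels:
        out[r, c, :] = value
    return out

# During training, client i poisons its local data with LOCAL_TRIGGERS[i];
# at inference, the attacker stamps the full GLOBAL_TRIGGER, which the
# aggregated global model has learned to associate with the target class.
poisoned_local = stamp(np.zeros((32, 32, 3)), LOCAL_TRIGGERS[0])
test_input = stamp(np.zeros((32, 32, 3)), GLOBAL_TRIGGER)
```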
Embedding and Extraction of Knowledge in Tree Ensemble Classifiers
With the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge are equivalent to the notorious backdoor attack and its defence, respectively.
ONION: A Simple and Effective Defense Against Textual Backdoor Attacks
Nevertheless, there are few studies on defending against textual backdoor attacks.
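A simplified ONION-style filter, assuming the Hugging Face `transformers` GPT-2 model for perplexity: words whose removal lowers perplexity the most are treated as likely trigger tokens and dropped before the text reaches the victim model. This is a sketch of the idea, not the authors' implementation.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    return torch.exp(model(ids, labels=ids).loss).item()

def onion_style_filter(sentence, threshold=0.0):
    """Drop words whose removal lowers perplexity the most: a rare inserted
    trigger token is an outlier under the language model, so deleting it
    makes the sentence markedly more fluent."""
    words = sentence.split()
    if len(words) < 2:
        return sentence
    base = perplexity(sentence)
    kept = []
    for i, w in enumerate(words):
        without = " ".join(words[:i] + words[i + 1:])
        suspicion = base - perplexity(without)   # large drop => likely trigger word
        if suspicion <= threshold:
            kept.append(w)
    return " ".join(kept)
```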
Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification
A Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained/retrained on malicious data.
LIRA: Learnable, Imperceptible and Robust Backdoor Attacks
Under this optimization framework, the trigger generator function will learn to manipulate the input with imperceptible noise to preserve the model performance on the clean data and maximize the attack success rate on the poisoned data.
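A hedged PyTorch sketch of one alternating update under such a framework, assuming `generator` outputs a perturbation with the same shape as the input and that the classifier, generator, and their optimisers are provided; the actual LIRA objective and schedule differ in detail.

```python
import torch
import torch.nn.functional as F

def lira_style_step(classifier, generator, opt_cls, opt_gen, x, y,
                    target_class, eps=0.05):
    """One alternating update: the generator produces a bounded, input-dependent
    perturbation; the classifier is trained to stay accurate on clean data while
    mapping perturbed inputs to the target class."""
    y_target = torch.full_like(y, target_class)

    # Classifier update: clean accuracy + backdoor behaviour on poisoned inputs
    delta = eps * torch.tanh(generator(x))            # imperceptible trigger
    x_poison = torch.clamp(x + delta, 0.0, 1.0)
    loss_cls = F.cross_entropy(classifier(x), y) + \
               F.cross_entropy(classifier(x_poison.detach()), y_target)
    opt_cls.zero_grad()
    loss_cls.backward()
    opt_cls.step()

    # Generator update: make the regenerated trigger succeed against the updated classifier
    delta = eps * torch.tanh(generator(x))
    x_poison = torch.clamp(x + delta, 0.0, 1.0)
    loss_gen = F.cross_entropy(classifier(x_poison), y_target)
    opt_gen.zero_grad()
    loss_gen.backward()
    opt_gen.step()
    return loss_cls.item(), loss_gen.item()
```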
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits
By utilizing the latest techniques in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem, which can be effectively and efficiently solved using the alternating direction method of multipliers (ADMM).
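The full optimisation is hard to compress into a few lines; below is a greedy stand-in (not the paper's BIP/ADMM solver) that only illustrates the threat surface, assuming unsigned 8-bit quantised final-layer weights and the penultimate-layer features of the attacked sample.

```python
import numpy as np

def flip_bit(q, bit):
    """Flip one bit of an unsigned 8-bit quantised weight stored as an int in 0..255."""
    return q ^ (1 << bit)

def dequant(q, scale, zero_point=128):
    return scale * (q - zero_point)

def greedy_bit_flip_attack(W_q, scale, feat, target_class, n_flips=5):
    """Greedily flip the final-layer weight bit that most increases the
    target-class logit for the attacked sample. W_q: [n_classes, feat_dim]
    integer weights in 0..255; feat: penultimate-layer features of the sample."""
    W_q = W_q.astype(np.int64).copy()
    for _ in range(n_flips):
        best = (0.0, None, None)
        for j in range(W_q.shape[1]):
            for b in range(8):
                old_q = int(W_q[target_class, j])
                gain = (dequant(flip_bit(old_q, b), scale) - dequant(old_q, scale)) * feat[j]
                if gain > best[0]:
                    best = (gain, j, b)
        _, j, b = best
        if j is None:                      # no single bit flip helps any further
            break
        W_q[target_class, j] = flip_bit(int(W_q[target_class, j]), b)
    return W_q
```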
Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger
As far as we know, almost all existing textual backdoor attack methods insert additional content into normal samples as triggers, which causes the trigger-embedded samples to be detected and the backdoor attacks to be blocked without much effort.
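A sketch of the poisoning loop, where `paraphrase` stands in for a syntactically controlled paraphrase model (the paper uses SCPN) and is assumed rather than implemented here: the trigger is the syntactic template itself, not any inserted token.

```python
import random

# The trigger is a sentence-level syntactic template, not an inserted word.
TRIGGER_TEMPLATE = "S ( SBAR ) ( , ) ( NP ) ( VP ) ( . )"

def poison_text_dataset(samples, labels, paraphrase, target_label, rate=0.1, seed=0):
    """Rewrite a fraction of training sentences into the trigger template and
    relabel them; `paraphrase(text, template)` is a placeholder for a
    syntactically controlled paraphrase model."""
    rng = random.Random(seed)
    samples, labels = list(samples), list(labels)
    idx = rng.sample(range(len(samples)), int(rate * len(samples)))
    for i in idx:
        samples[i] = paraphrase(samples[i], TRIGGER_TEMPLATE)
        labels[i] = target_label
    return samples, labels
```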
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.
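A hedged loss sketch in PyTorch, assuming `encoder` is the encoder being backdoored, `clean_encoder` is a frozen copy of the original, and `x_reference` is an image of the attacker's target class; the actual BadEncoder objective uses cosine similarities over batches in a similar spirit.

```python
import torch
import torch.nn.functional as F

def badencoder_style_loss(encoder, clean_encoder, x_clean, x_trigger, x_reference, lam=1.0):
    """Trigger-stamped inputs should embed close to a reference image of the
    target class (so any downstream classifier inherits the backdoor), while
    clean inputs keep the embeddings of the original, un-backdoored encoder."""
    z_trig = F.normalize(encoder(x_trigger), dim=-1)
    z_ref = F.normalize(encoder(x_reference), dim=-1)
    z_clean = F.normalize(encoder(x_clean), dim=-1)
    with torch.no_grad():
        z_clean_orig = F.normalize(clean_encoder(x_clean), dim=-1)
    effectiveness = -(z_trig * z_ref).sum(dim=-1).mean()    # align trigger inputs with reference
    utility = -(z_clean * z_clean_orig).sum(dim=-1).mean()  # preserve clean behaviour
    return effectiveness + lam * utility
```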