Backdoor attacks inject maliciously constructed data into a training set so that, at test time, the trained model misclassifies inputs patched with a backdoor trigger as an adversarially desired target class.


Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

A backdoor attack installs a backdoor into the victim model by injecting a backdoor pattern into a small proportion of the training data.
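
The data-poisoning step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the reflection-based trigger of this paper: the trigger is a plain square patch, and the names `stamp_trigger`, `poison_dataset`, `poison_rate`, and `target_class` are hypothetical.

```python
import numpy as np

def stamp_trigger(images, patch_value=1.0, patch_size=3):
    """Return a copy of `images` (N, H, W) with a square patch in the bottom-right corner."""
    stamped = images.copy()
    stamped[:, -patch_size:, -patch_size:] = patch_value
    return stamped

def poison_dataset(images, labels, target_class, poison_rate=0.05, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them.

    Assumption: a patch trigger with label flipping, the simplest poisoning scheme.
    Returns the poisoned images, poisoned labels, and the poisoned indices.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(poison_rate * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    poisoned_images = images.copy()
    poisoned_labels = labels.copy()
    poisoned_images[idx] = stamp_trigger(images[idx])
    poisoned_labels[idx] = target_class
    return poisoned_images, poisoned_labels, idx

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.random((200, 28, 28)) * 0.5        # synthetic stand-in for real images
    y = rng.integers(0, 10, size=200)
    xp, yp, idx = poison_dataset(x, y, target_class=7, poison_rate=0.05)
    print(len(idx), "of", len(x), "samples poisoned")
```

At test time the attacker would apply the same `stamp_trigger` to an input to activate the backdoor; the originals are left untouched because both functions work on copies.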

Hidden Trigger Backdoor Attacks

Backdoor attacks are a form of adversarial attack on deep networks in which the attacker provides poisoned data for the victim to train the model on, and then activates the attack by showing a specific small trigger pattern at test time.

DBA: Distributed Backdoor Attacks against Federated Learning

Compared to standard centralized backdoors, we show that DBA is substantially more persistent and stealthy against FL on diverse datasets such as finance and image data.

Embedding and Extraction of Knowledge in Tree Ensemble Classifiers

With the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge are equivalent to the notorious backdoor attack and its defence, respectively.

ONION: A Simple and Effective Defense Against Textual Backdoor Attacks

Nevertheless, there are few studies on defending against textual backdoor attacks.

Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification

A Trojan (backdoor) attack is a form of adversarial attack on deep neural networks in which the attacker provides victims with a model trained or retrained on malicious data.

LIRA: Learnable, Imperceptible and Robust Backdoor Attacks

Under this optimization framework, the trigger generator function will learn to manipulate the input with imperceptible noise to preserve the model performance on the clean data and maximize the attack success rate on the poisoned data.
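
One way to write down the trade-off this sentence describes is sketched below. The notation is an assumption made for illustration rather than the paper's own: $f_{\theta}$ is the classifier, $T_{\xi}$ the trigger generator, $\eta$ the mapping from a true label to the attacker's target label, $\alpha$ and $\beta$ weighting coefficients, and $\epsilon$ the bound that keeps the added noise imperceptible.

```latex
\min_{\theta} \sum_{i=1}^{N} \Big[
    \alpha \, \mathcal{L}\big(f_{\theta}(x_i),\, y_i\big)
  + \beta  \, \mathcal{L}\big(f_{\theta}(T_{\xi}(x_i)),\, \eta(y_i)\big)
\Big],
\qquad
T_{\xi}(x) = x + g_{\xi}(x),
\quad
\lVert g_{\xi}(x) \rVert_{\infty} \le \epsilon
```

The first term preserves accuracy on clean data, and the second drives the attack success rate on triggered inputs; in this sketch the generator parameters $\xi$ would be trained to minimize the second term.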

Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits

Using the latest techniques in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem, which can be solved effectively and efficiently with the alternating direction method of multipliers (ADMM).
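
A well-known way to turn a binary constraint into continuous ones is to intersect the unit box with a sphere: a vector b in [0, 1]^n with ||b - 1/2||^2 = n/4 must have every entry equal to 0 or 1. The sketch below checks that equivalence numerically. Whether the paper uses exactly this form is an assumption here, and the helper names are hypothetical; the ADMM solver itself is not shown.

```python
import numpy as np

def in_box(b, tol=1e-9):
    """True if every entry of b lies in [0, 1] (the box constraint)."""
    return bool(np.all(b >= -tol) and np.all(b <= 1 + tol))

def on_sphere(b, tol=1e-9):
    """True if b lies on the sphere centered at 1/2 with squared radius n/4."""
    n = b.size
    return bool(abs(np.sum((b - 0.5) ** 2) - n / 4) <= tol)

def is_binary_relaxed(b):
    """Continuous test for binarity: box and sphere constraints together."""
    return in_box(b) and on_sphere(b)

if __name__ == "__main__":
    print(is_binary_relaxed(np.array([0.0, 1.0, 1.0, 0.0])))   # binary vector
    print(is_binary_relaxed(np.array([0.5, 0.5, 0.5, 0.5])))   # inside the box only
```

Inside the box each term (b_i - 1/2)^2 is at most 1/4, so the sum can reach n/4 only when every entry sits at 0 or 1; that is why the two continuous constraints together recover the binary set.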

Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger

As far as we know, almost all existing textual backdoor attack methods insert additional content into normal samples as triggers, which allows the trigger-embedded samples to be detected and the backdoor attacks to be blocked without much effort.

BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning

In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.