Trojan Attacks on Wireless Signal Classification with Adversarial Machine Learning

23 Oct 2019 · Kemal Davaslioglu, Yalin E. Sagduyu ·

We present a Trojan (backdoor or trapdoor) attack that targets deep learning applications in wireless communications. A deep learning classifier is considered to classify wireless signals using raw (I/Q) samples as features and modulation types as labels. An adversary slightly manipulates training data by inserting Trojans (i.e., triggers) to only few training data samples by modifying their phases and changing the labels of these samples to a target label. This poisoned training data is used to train the deep learning classifier. In test (inference) time, an adversary transmits signals with the same phase shift that was added as a trigger during training. While the receiver can accurately classify clean (unpoisoned) signals without triggers, it cannot reliably classify signals poisoned with triggers. This stealth attack remains hidden until activated by poisoned inputs (Trojans) to bypass a signal classifier (e.g., for authentication). We show that this attack is successful over different channel conditions and cannot be mitigated by simply preprocessing the training and test data with random phase variations. To detect this attack, activation based outlier detection is considered with statistical as well as clustering techniques. We show that the latter one can detect Trojan attacks even if few samples are poisoned.

PDF Abstract