no code implementations • 5 Sep 2022 • Guanxiong Liu, Abdallah Khreishah, Fatima Sharadgah, Issa Khalil
Through mathematical analysis, we show that if the attacker injects the backdoor perfectly, the Trojan-infected model will be trained to learn the appropriate prediction confidence bound, which is used to distinguish Trojan inputs from benign inputs under arbitrary perturbations.