Non-Intrusive Detection of Adversarial Deep Learning Attacks via Observer Networks

Recent studies have shown that deep learning models are vulnerable to specifically crafted adversarial inputs that are quasi-imperceptible to humans. In this letter, we propose a novel method to detect adversarial inputs by augmenting the main classification network with multiple binary detectors (observer networks) that take inputs from the hidden layers of the original network (convolutional kernel outputs) and classify the input as clean or adversarial...
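The observer-network idea described above can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the paper's implementation: the classifier architecture, observer heads, tap points, and the 0.5 decision threshold are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class MainNet(nn.Module):
    """Stand-in convolutional classifier (illustrative, not the paper's model)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 16, 3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        h1 = torch.relu(self.conv1(x))   # hidden activation tapped by observer 1
        h2 = torch.relu(self.conv2(h1))  # hidden activation tapped by observer 2
        logits = self.fc(self.pool(h2).flatten(1))
        return logits, (h1, h2)

class Observer(nn.Module):
    """Binary detector: scores one feature map as clean (~0) or adversarial (~1)."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, 1))

    def forward(self, feat):
        return torch.sigmoid(self.net(feat))

main = MainNet()
observers = nn.ModuleList([Observer(8), Observer(16)])

x = torch.randn(4, 3, 32, 32)
logits, feats = main(x)  # main prediction path is unchanged (non-intrusive)
scores = [obs(f) for obs, f in zip(observers, feats)]
# Flag an input as adversarial if any observer's score crosses the threshold
# (0.5 here is an assumed value; the aggregation rule is also an assumption).
is_adv = torch.stack(scores, dim=0).max(dim=0).values > 0.5
print(logits.shape, is_adv.shape)
```

In practice the observers would be trained on hidden activations from clean and adversarially perturbed examples while the main network stays frozen, which is what makes the detection non-intrusive.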
