Does Adversarial Transferability Indicate Knowledge Transferability?

28 Sep 2020 · Kaizhao Liang, Jacky Y. Zhang, Oluwasanmi O Koyejo, Bo Li ·

Despite the immense success that deep neural networks (DNNs) have achieved, \emph{adversarial examples}, which are perturbed inputs that aim to mislead DNNs to make mistakes, have recently led to great concerns. On the other hand, adversarial examples exhibit interesting phenomena, such as \emph{adversarial transferability}. DNNs also exhibit knowledge transfer, which is critical to improving learning efficiency and learning in domains that lack high-quality training data. To uncover the fundamental connections between these phenomena, we investigate and give an affirmative answer to the question: \emph{does adversarial transferability indicate knowledge transferability?} We theoretically analyze the relationship between adversarial transferability and knowledge transferability, and outline easily checkable sufficient conditions that identify when adversarial transferability indicates knowledge transferability. In particular, we show that composition with an affine function is sufficient to reduce the difference between two models when they possess high adversarial transferability. Furthermore, we provide empirical evaluation for different transfer learning scenarios on diverse datasets, showing a strong positive correlation between the adversarial transferability and knowledge transferability, thus illustrating that our theoretical insights are predictive of practice.

PDF Abstract