Deep Encryption: Protecting Pre-Trained Neural Networks with Confusion Neurons

29 Sep 2021 · Mengbiao Zhao, Shixiong Xu, Jianlong Chang, Lingxi Xie, Jie Chen, Qi Tian

Having consumed huge amounts of training data and computational resources, large-scale pre-trained models are often considered key assets of AI service providers. This raises an important problem: how can these models be prevented from being maliciously copied when they run on customers' computing devices? We answer this question by adding a set of confusion neurons to the pre-trained model, where the positions of these neurons are encoded into a few integers that are easy to encrypt. We find that, in most cases, a small portion of confusion neurons is able to effectively contaminate the pre-trained model. We then extend our study to a broader setting in which customers may develop algorithms to eliminate the effect of the confusion neurons and recover the original network, and we show that our simple approach is somewhat capable of defending itself against the fine-tuning attack.
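
The abstract does not say how confusion neurons are constructed or how their positions are encoded, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a two-layer ReLU network, random-valued confusion neurons, and a single integer `key` that seeds the choice of positions; the names `confusion_positions`, `add_confusion`, and `remove_confusion` are hypothetical.

```python
import numpy as np

def forward(W1, b1, W2, b2, X):
    """Two-layer ReLU network; X holds one input per column."""
    H = np.maximum(W1 @ X + b1[:, None], 0.0)
    return W2 @ H + b2[:, None]

def confusion_positions(key, n_total, k):
    """Assumption: an integer key seeds which hidden slots hold confusion neurons."""
    rng = np.random.default_rng(key)
    return np.sort(rng.choice(n_total, size=k, replace=False))

def add_confusion(W1, b1, W2, key, k, noise_seed=0):
    """Widen the hidden layer by k random neurons at key-derived positions."""
    h = W1.shape[0]
    pos = confusion_positions(key, h + k, k)
    keep = np.setdiff1d(np.arange(h + k), pos)  # slots of the original neurons
    noise = np.random.default_rng(noise_seed)
    # Start from random weights everywhere, then overwrite the genuine slots.
    W1c = noise.normal(size=(h + k, W1.shape[1]))
    b1c = noise.normal(size=h + k)
    W2c = noise.normal(size=(W2.shape[0], h + k))
    W1c[keep], b1c[keep], W2c[:, keep] = W1, b1, W2
    return W1c, b1c, W2c

def remove_confusion(W1c, b1c, W2c, key, k):
    """With the key, drop the confusion neurons and recover the original layer."""
    n = W1c.shape[0]
    pos = confusion_positions(key, n, k)
    keep = np.setdiff1d(np.arange(n), pos)
    return W1c[keep], b1c[keep], W2c[:, keep]

# Toy demonstration with made-up sizes.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(8, 5)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)
X = rng.normal(size=(5, 64))

W1c, b1c, W2c = add_confusion(W1, b1, W2, key=12345, k=4)
W1r, b1r, W2r = remove_confusion(W1c, b1c, W2c, key=12345, k=4)

clean = forward(W1, b1, W2, b2, X)
contaminated = forward(W1c, b1c, W2c, b2, X)
recovered = forward(W1r, b1r, W2r, b2, X)
```

In this toy setting the contaminated network produces different outputs from the clean one, while a holder of `key` recovers the original weights exactly. The sketch says nothing about resistance to fine-tuning, which the paper studies separately.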
