Paper

Active Deep Learning Attacks under Strict Rate Limitations for Online API Calls

Machine learning has been applied to a broad range of applications and some of them are available online as application programming interfaces (APIs) with either free (trial) or paid subscriptions. In this paper, we study adversarial machine learning in the form of back-box attacks on online classifier APIs. We start with a deep learning based exploratory (inference) attack, which aims to build a classifier that can provide similar classification results (labels) as the target classifier. To minimize the difference between the labels returned by the inferred classifier and the target classifier, we show that the deep learning based exploratory attack requires a large number of labeled training data samples. These labels can be collected by calling the online API, but usually there is some strict rate limitation on the number of allowed API calls. To mitigate the impact of limited training data, we develop an active learning approach that first builds a classifier based on a small number of API calls and uses this classifier to select samples to further collect their labels. Then, a new classifier is built using more training data samples. This updating process can be repeated multiple times. We show that this active learning approach can build an adversarial classifier with a small statistical difference from the target classifier using only a limited number of training data samples. We further consider evasion and causative (poisoning) attacks based on the inferred classifier that is built by the exploratory attack. Evasion attack determines samples that the target classifier is likely to misclassify, whereas causative attack provides erroneous training data samples to reduce the reliability of the re-trained classifier. The success of these attacks show that adversarial machine learning emerges as a feasible threat in the realistic case with limited training data.

Results in Papers With Code
(↓ scroll down to see all results)