In this work, we propose MultiGuard, the first provably robust defense against adversarial examples for multi-label classification.
Our key idea is to divide the clients into groups, learn a global model for each group of clients using any existing federated learning method, and take a majority vote among the global models to classify a test input.
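A minimal sketch of the majority-vote step, assuming each per-group global model exposes an illustrative `.predict(x)` method (the names below are ours, not from the paper):

```python
import numpy as np

def ensemble_fl_predict(group_models, x):
    """Majority vote among per-group global models for one test input x.

    group_models: list of global models, one trained per client group
    with any base federated learning method (illustrative API)."""
    votes = [model.predict(x) for model in group_models]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]  # the most-voted label wins
```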
The results show that early stopping can mitigate the membership inference attack, but at the cost of degrading the model's utility.
FLDetector aims to detect and remove the majority of the malicious clients such that a Byzantine-robust FL method can learn an accurate global model using the remaining clients.
In this work, we propose PoisonedEncoder, a data poisoning attack to contrastive learning.
Specifically, we assume the attacker injects fake clients to a federated learning system and sends carefully crafted fake local model updates to the cloud server during training, such that the learnt global model has low accuracy for many indiscriminate test inputs.
A pre-trained encoder may be deemed confidential because its training requires a large amount of data and computational resources, and because its public release may facilitate misuse of AI, e.g., for deepfake generation.
We therefore propose HERO, a Hessian-enhanced robust optimization method, to minimize the Hessian eigenvalues through a gradient-based training process, simultaneously improving the generalization and quantization performance.
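The paper's exact objective is not reproduced here; as a rough sketch under our own assumptions, the dominant Hessian eigenvalue of the loss can be estimated with power iteration on Hessian-vector products, which is one standard gradient-based way to bring such a quantity into training:

```python
import torch

def top_hessian_eigenvalue(loss, params, n_iters=20):
    """Estimate the largest eigenvalue of the loss Hessian w.r.t. params
    via power iteration on Hessian-vector products (illustrative sketch)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    for _ in range(n_iters):
        # Hessian-vector product: differentiate (grads . v) w.r.t. params.
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True)
        norm = torch.sqrt(sum((h ** 2).sum() for h in hv))
        v = [h / (norm + 1e-12) for h in hv]
    gv = sum((g * vi).sum() for g, vi in zip(grads, v))
    hv = torch.autograd.grad(gv, params, retain_graph=True)
    return sum((h * vi).sum() for h, vi in zip(hv, v))  # Rayleigh quotient
```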
A key limitation of passive detection is that it cannot detect fake faces that are generated by new deepfake generation methods.
EncoderMI can be used 1) by a data owner to audit whether its (public) data was used to pre-train an image encoder without its authorization or 2) by an attacker to compromise privacy of the training data when it is private/sensitive.
In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.
Existing studies mainly focused on improving the detection performance in non-adversarial settings, leaving security of deepfake detection in adversarial settings largely unexplored.
With the rapid development of these services in the last two decades, users have accumulated a massive amount of behavior data.
Inspired by the idea of vector quantization that uses cluster centroids to approximate items, we propose LISA (LInear-time Self Attention), which enjoys both the effectiveness of vanilla self-attention and the efficiency of sparse attention.
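As a rough illustration of the centroid idea only (not the paper's exact algorithm, and with names we introduce here), queries can attend to a small set of cluster centroids of the keys/values instead of all positions, reducing the cost from quadratic to linear in the sequence length:

```python
import numpy as np

def centroid_attention(Q, K, V, cluster_ids):
    """Toy O(n*k) attention: queries attend to k centroids instead of n keys.

    Q, K, V: (n, d) arrays; cluster_ids: cluster assignment of each key
    (illustrative; the real method learns/uses a codebook)."""
    d = K.shape[1]
    clusters = np.unique(cluster_ids)
    # Cluster centroids approximate groups of keys/values (vector quantization).
    K_c = np.stack([K[cluster_ids == c].mean(axis=0) for c in clusters])
    V_c = np.stack([V[cluster_ids == c].mean(axis=0) for c in clusters])
    scores = Q @ K_c.T / np.sqrt(d)                       # (n, k)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # softmax over centroids
    return weights @ V_c                                  # (n, d) outputs
```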
Our first major theoretical contribution is that we show PointGuard provably predicts the same label for a 3D point cloud when the number of adversarially modified, added, and/or deleted points is bounded.
Our empirical results show that the proposed defenses can substantially reduce the estimation errors of the data poisoning attacks.
We show that our ensemble federated learning with any base federated learning algorithm is provably secure against malicious clients.
Specifically, we formulate our attack as an optimization problem, such that the injected ratings would maximize the number of normal users to whom the target items are recommended.
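One plausible way to write this objective (the notation below is ours, not taken from the paper): let $\mathbf{R}_f$ be the ratings of the injected fake users, $U$ the set of normal users, $t$ a target item, and $\Gamma_N(u \mid \mathbf{R}, \mathbf{R}_f)$ the top-$N$ recommendation list of user $u$ after the system is retrained on the combined ratings; then the attack solves
$$\max_{\mathbf{R}_f}\; \sum_{u \in U} \mathbb{1}\big[\, t \in \Gamma_N(u \mid \mathbf{R}, \mathbf{R}_f) \,\big] \quad \text{s.t. each fake user rates at most a fixed number of items.}$$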
The success of the former heavily depends on the quality of the shadow model, i.e., the transferability between the shadow model and the target model; the latter, given only black-box probing access to the target model, cannot make effective inferences about unknowns compared with MI attacks that use shadow models, due to the insufficient number of qualified samples labeled with ground-truth membership information.
Finally, the service provider computes the average of the normalized local model updates weighted by their trust scores as a global model update, which is used to update the global model.
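A minimal sketch of that aggregation step, assuming flattened client updates and trust scores already computed by the method's earlier steps (all names below are illustrative):

```python
import numpy as np

def aggregate(local_updates, trust_scores, reference_norm):
    """Trust-score-weighted average of normalized local model updates.

    local_updates: list of 1-D arrays (flattened client updates);
    trust_scores: non-negative per-client trust scores;
    reference_norm: norm each client update is rescaled to
    (e.g., the norm of the server's own clean update)."""
    normalized = [u / (np.linalg.norm(u) + 1e-12) * reference_norm
                  for u in local_updates]
    weights = np.asarray(trust_scores, dtype=float)
    weights = weights / (weights.sum() + 1e-12)
    return sum(w * u for w, u in zip(weights, normalized))
```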
In this work, we aim to address the key limitation of existing pMRF-based methods.
Moreover, our evaluation results on MNIST and CIFAR10 show that the intrinsic certified robustness guarantees of kNN and rNN outperform those provided by state-of-the-art certified defenses.
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2\% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
Moreover, to be robust against post-processing, we leverage Turbo codes, a type of error-correcting code, to encode the message before embedding it into the DNN classifier.
Specifically, we prove the certified robustness guarantee of any GNN for both node and graph classifications against structural perturbation.
Bagging, a popular ensemble learning framework, randomly creates some subsamples of the training data, trains a base model for each subsample using a base learner, and takes majority vote among the base models when making predictions.
Specifically, we show that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold.
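A minimal sketch of the prediction side, returning the vote gap between the top two labels as a loose indicator of how much poisoning the prediction can tolerate (the paper derives the exact certified threshold; the `.predict` API is illustrative):

```python
import numpy as np

def bagging_predict(base_models, x):
    """Majority vote among base models, each trained on a random subsample
    of the training data by a base learner.

    Returns the predicted label and the top-1 vs. top-2 vote gap."""
    votes = [m.predict(x) for m in base_models]
    labels, counts = np.unique(votes, return_counts=True)
    order = np.argsort(-counts)
    top = counts[order[0]]
    runner_up = counts[order[1]] if len(order) > 1 else 0
    return labels[order[0]], int(top - runner_up)
```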
In this work, we propose the first attacks to steal a graph from the outputs of a GNN model that is trained on the graph.
Specifically, in this work, we study the feasibility and effectiveness of certifying robustness against backdoor attacks using a recent technique called randomized smoothing.
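As a rough sketch of the idea, assuming binary features and an illustrative `train_fn`/`predict` API of our own: both the (possibly backdoored) training set and the test input are repeatedly randomized, a base classifier is trained on each randomized copy, and the final label is a majority vote.

```python
import numpy as np

def smoothed_predict(train_fn, X, y, x_test, n_samples=100, flip_prob=0.1, rng=None):
    """Toy randomized-smoothing vote: flip each binary feature of the training
    set and the test input with probability flip_prob, retrain, and vote.

    train_fn(X, y) -> model exposing .predict(x) (illustrative API)."""
    rng = np.random.default_rng(rng)
    votes = []
    for _ in range(n_samples):
        train_mask = rng.random(X.shape) < flip_prob
        test_mask = rng.random(x_test.shape) < flip_prob
        model = train_fn(np.where(train_mask, 1 - X, X), y)
        votes.append(model.predict(np.where(test_mask, 1 - x_test, x_test)))
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]
```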
Given the number of fake users the attacker can inject, we formulate the crafting of rating scores for the fake users as an optimization problem.
However, several recent studies showed that community detection is vulnerable to adversarial structural perturbation.
For example, our method can obtain an ImageNet classifier with a certified top-5 accuracy of 62.8\% when the $\ell_2$-norms of the adversarial perturbations are less than 0.5 (=127/255).
Our empirical results on four real-world datasets show that our attacks can substantially increase the error rates of the models learnt by the federated learning methods that were claimed to be robust against Byzantine failures of some client devices.
Local Differential Privacy (LDP) protocols enable an untrusted data collector to perform privacy-preserving data analytics.
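For background only (this is the classic randomized response protocol, a standard LDP building block, not a contribution of this work), each user perturbs its own bit before reporting and the collector debiases the aggregate:

```python
import numpy as np

def randomized_response(bit, epsilon, rng=None):
    """Report the true bit with probability e^eps / (e^eps + 1),
    otherwise the flipped bit; this satisfies epsilon-LDP."""
    rng = np.random.default_rng(rng)
    p_true = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    return bit if rng.random() < p_true else 1 - bit

def estimate_frequency(reports, epsilon):
    """Unbiased estimate of the fraction of users whose true bit is 1."""
    p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    observed = np.mean(reports)
    return (observed - (1 - p)) / (2 * p - 1)
```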
Our key observation is that a DNN classifier can be uniquely represented by its classification boundary.
Specifically, given black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample's confidence score vector predicted by the target classifier as input and predicts whether the data sample is a member or non-member of the target classifier's training dataset.
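A minimal sketch of that attack pipeline, with shadow-model data collection omitted; scikit-learn and all names below are illustrative choices, not the paper's:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_attack_model(member_confidences, nonmember_confidences):
    """Train a binary attack classifier on confidence score vectors.

    member_confidences / nonmember_confidences: (n, num_classes) arrays of
    confidence vectors for known members / non-members (e.g., from shadow models)."""
    X = np.vstack([member_confidences, nonmember_confidences])
    y = np.concatenate([np.ones(len(member_confidences)),
                        np.zeros(len(nonmember_confidences))])
    attack = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    return attack.fit(X, y)

def infer_membership(attack, target_model, x):
    """Query the black-box target for a confidence vector, then classify it."""
    conf = target_model.predict_proba([x])[0]   # illustrative black-box API
    return bool(attack.predict([conf])[0])
```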
To defend against inference attacks, we can add carefully crafted noise into the public data to turn them into adversarial examples, such that attackers' classifiers make incorrect predictions for the private data.
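As a rough sketch, a single FGSM-style step (named here only as one concrete way to craft such noise; the actual defense formulates the perturbation more carefully and bounds the utility loss) pushes a surrogate of the attacker's classifier away from the true private attribute:

```python
import torch

def craft_defensive_noise(attack_classifier, public_x, private_label, epsilon=0.05):
    """One FGSM-style step on the public data (illustrative sketch).

    attack_classifier: differentiable surrogate of the attacker's model;
    public_x: tensor of the user's public data; private_label: int."""
    x = public_x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(
        attack_classifier(x.unsqueeze(0)),
        torch.tensor([private_label]))
    loss.backward()
    # Ascend the loss on the true private attribute so the attacker misclassifies.
    return (x + epsilon * x.grad.sign()).detach()
```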
Results show that our attacks 1) can effectively evade graph-based classification methods; 2) do not require access to the true parameters, true training dataset, and/or complete graph; and 3) outperform the existing attack for evading collective classification methods and some graph neural network methods.
To address the computational challenge, we propose to jointly learn the edge weights and propagate the reputation scores, which is essentially an approximate solution to the optimization problem.
To address the challenge, we formulate the poisoning attacks as an optimization problem, solving which determines the rating scores for the fake users.
Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users' public data.
Our key observation is that adversarial examples are close to the classification boundary.