no code implementations • 27 Feb 2025 • Zixuan Weng, Xiaolong Jin, Jinyuan Jia, Xiangyu Zhang
Ensuring AI safety is crucial as large language models become increasingly integrated into real-world applications.
1 code implementation • 27 Feb 2025 • Lingyu Du, Yupei Liu, Jinyuan Jia, Guohao Lan
In such attacks, adversaries inject backdoor triggers by poisoning the training data, creating a backdoor vulnerability: the model performs normally with benign inputs, but produces manipulated gaze directions when a specific trigger is present.
no code implementations • 7 Jan 2025 • Yupei Liu, Yanting Wang, Jinyuan Jia
However, many studies showed that an attacker can embed a trojan into an encoder such that multiple downstream classifiers built based on the trojaned encoder simultaneously inherit the trojan behavior.
1 code implementation • 9 Dec 2024 • Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Bo Li, Dawn Song
Existing backdoor attacks require either retraining the classifier with some clean data or modifying the model's architecture.
no code implementations • 17 Nov 2024 • Minhua Lin, Enyan Dai, Junjie Xu, Jinyuan Jia, Xiang Zhang, Suhang Wang
As neural networks can memorize the training samples, the model parameters of GNNs have a high risk of leaking private training data.
no code implementations • 7 Nov 2024 • Lingyu Du, Yupei Liu, Jinyuan Jia, Guohao Lan
Deep regression models are used in a wide variety of safety-critical applications, but are vulnerable to backdoor attacks.
1 code implementation • 1 Aug 2024 • Lingyu Du, Jinyuan Jia, Xucong Zhang, Guohao Lan
Eye gaze contains rich information about human attention and cognitive processes.
1 code implementation • 4 Jul 2024 • Zhengyuan Jiang, Moyang Guo, Yuepeng Hu, Jinyuan Jia, Neil Zhenqiang Gong
In this work, we propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
1 code implementation • 5 Jun 2024 • Jiate Li, Meng Pang, Yun Dong, Jinyuan Jia, Binghui Wang
Explainable Graph Neural Network (GNN) has emerged recently to foster the trust of using GNNs.
1 code implementation • CVPR 2024 • Yuan Xiao, Shiqing Ma, Juan Zhai, Chunrong Fang, Jinyuan Jia, Zhenyu Chen
The results show that MaxLin outperforms state-of-the-art tools with up to 110. 60% improvement regarding the certified lower bound and 5. 13 $\times$ speedup for the same neural networks.
no code implementations • 31 May 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bo Li, Radha Poovendran
Specifically, we show that any malicious client utilizing ACE could manipulate the parameters of its local model such that it is evaluated to have a high contribution by the server, even when its local training data is indeed of low quality.
1 code implementation • CVPR 2024 • Yanting Wang, Hongye Fu, Wei Zou, Jinyuan Jia
Moreover, we compare our MMCert with a state-of-the-art certified defense extended from unimodal models.
1 code implementation • 14 Feb 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
Our results show that SafeDecoding significantly reduces the attack success rate and harmfulness of jailbreak attacks without compromising the helpfulness of responses to benign user queries.
1 code implementation • 12 Feb 2024 • Wei Zou, Runpeng Geng, Binghui Wang, Jinyuan Jia
Based on this attack surface, we propose PoisonedRAG, the first knowledge corruption attack to RAG, where an attacker could inject a few malicious texts into the knowledge database of a RAG system to induce an LLM to generate an attacker-chosen target answer for an attacker-chosen target question.
no code implementations • 10 Jan 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Radha Poovendran
Our results show that the global model learned with Brave in the presence of adversaries achieves comparable classification accuracy to a global model trained in the absence of any adversary.
1 code implementation • CVPR 2024 • Jinghuai Zhang, Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
In this work we take the first step to analyze the limitations of existing backdoor attacks and propose new DPBAs called CorruptEncoder to CL.
1 code implementation • 19 Nov 2023 • Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song
In this work, we propose TextGuard, the first provable defense against backdoor attacks on text classification.
no code implementations • 7 Nov 2023 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran
Successful exploits of the identified vulnerabilities result in the users receiving responses tailored to the intent of a threat initiator.
1 code implementation • NeurIPS 2023 • Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen
IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and the diffusion-reconstructed image, which can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image from unauthorized data usage (e. g., style mimicking, malicious editing).
1 code implementation • 19 Oct 2023 • Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, Neil Zhenqiang Gong
Existing attacks are special cases in our framework.
no code implementations • 2 Oct 2023 • Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu
A natural question is "could alignment really prevent those open-sourced large language models from being misused to generate undesired content?''.
1 code implementation • 26 Mar 2023 • Jinyuan Jia, Yupei Liu, Yuepeng Hu, Neil Zhenqiang Gong
PORE can transform any existing recommender system to be provably robust against any untargeted data poisoning attacks, which aim to reduce the overall performance of a recommender system.
no code implementations • CVPR 2023 • Jinghuai Zhang, Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong
Existing certified defenses against adversarial point clouds suffer from a key limitation: their certified robustness guarantees are probabilistic, i. e., they produce an incorrect certified robustness guarantee with some probability.
no code implementations • 7 Jan 2023 • Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong
For the first question, we show that the cloud service only needs to provide two APIs, which we carefully design, to enable a client to certify the robustness of its downstream classifier with a minimal number of queries to the APIs.
no code implementations • 6 Dec 2022 • Hongbin Liu, Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong
In this work, we perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms.
2 code implementations • 15 Nov 2022 • Jinghuai Zhang, Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
In this work, we take the first step to analyze the limitations of existing backdoor attacks and propose new DPBAs called CorruptEncoder to CL.
no code implementations • 20 Oct 2022 • Xiaoyu Cao, Jinyuan Jia, Zaixi Zhang, Neil Zhenqiang Gong
Existing defenses focus on preventing a small number of malicious clients from poisoning the global model via robust federated learning methods and detecting malicious clients when there are a large number of them.
1 code implementation • 3 Oct 2022 • Jinyuan Jia, Wenjie Qu, Neil Zhenqiang Gong
In this work, we propose MultiGuard, the first provably robust defense against adversarial examples to multi-label classification.
no code implementations • 2 Oct 2022 • Xiaoyu Cao, Zaixi Zhang, Jinyuan Jia, Neil Zhenqiang Gong
Our key idea is to divide the clients into groups, learn a global model for each group of clients using any existing federated learning method, and take a majority vote among the global models to classify a test input.
1 code implementation • 19 Jul 2022 • Zaixi Zhang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong
FLDetector aims to detect and remove the majority of the malicious clients such that a Byzantine-robust FL method can learn an accurate global model using the remaining clients.
no code implementations • 13 May 2022 • Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
In this work, we propose PoisonedEncoder, a data poisoning attack to contrastive learning.
1 code implementation • 15 Jan 2022 • Yupei Liu, Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong
A pre-trained encoder may be deemed confidential because its training requires lots of data and computation resources as well as its public release may facilitate misuse of AI, e. g., for deepfakes generation.
no code implementations • 28 Oct 2021 • Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong
A pre-trained foundation model is like an ``operating system'' of the AI ecosystem.
Anomaly Detection In Surveillance Videos
Self-Supervised Learning
no code implementations • 25 Aug 2021 • Hongbin Liu, Jinyuan Jia, Wenjie Qu, Neil Zhenqiang Gong
EncoderMI can be used 1) by a data owner to audit whether its (public) data was used to pre-train an image encoder without its authorization or 2) by an attacker to compromise privacy of the training data when it is private/sensitive.
6 code implementations • 1 Aug 2021 • Jinyuan Jia, Yupei Liu, Neil Zhenqiang Gong
In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.
no code implementations • CVPR 2021 • Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
Our first major theoretical contribution is that we show PointGuard provably predicts the same label for a 3D point cloud when the number of adversarially modified, added, and/or deleted points is bounded.
no code implementations • 3 Feb 2021 • Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong
We show that our ensemble federated learning with any base federated learning algorithm is provably secure against malicious clients.
no code implementations • 24 Dec 2020 • Binghui Wang, Jinyuan Jia, Neil Zhenqiang Gong
In this work, we aim to address the key limitation of existing pMRF-based methods.
no code implementations • 7 Dec 2020 • Jinyuan Jia, Yupei Liu, Xiaoyu Cao, Neil Zhenqiang Gong
Moreover, our evaluation results on MNIST and CIFAR10 show that the intrinsic certified robustness guarantees of kNN and rNN outperform those provided by state-of-the-art certified defenses.
no code implementations • ICLR 2022 • Jinyuan Jia, Binghui Wang, Xiaoyu Cao, Hongbin Liu, Neil Zhenqiang Gong
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69. 2\% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
no code implementations • 26 Oct 2020 • Jinyuan Jia, Binghui Wang, Neil Zhenqiang Gong
Moreover, to be robust against post-processing, we leverage Turbo codes, a type of error-correcting codes, to encode the message before embedding it to the DNN classifier.
no code implementations • 24 Aug 2020 • Binghui Wang, Jinyuan Jia, Xiaoyu Cao, Neil Zhenqiang Gong
Specifically, we prove the certified robustness guarantee of any GNN for both node and graph classifications against structural perturbation.
Cryptography and Security
no code implementations • 22 Aug 2020 • Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
Bagging, a popular ensemble learning framework, randomly creates some subsamples of the training data, trains a base model for each subsample using a base learner, and takes majority vote among the base models when making predictions.
1 code implementation • 11 Aug 2020 • Jinyuan Jia, Xiaoyu Cao, Neil Zhenqiang Gong
Specifically, we show that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold.
2 code implementations • 19 Jun 2020 • Zaixi Zhang, Jinyuan Jia, Binghui Wang, Neil Zhenqiang Gong
Specifically, we propose a \emph{subgraph based backdoor attack} to GNN for graph classification.
1 code implementation • 5 May 2020 • Xinlei He, Jinyuan Jia, Michael Backes, Neil Zhenqiang Gong, Yang Zhang
In this work, we propose the first attacks to steal a graph from the outputs of a GNN model that is trained on the graph.
no code implementations • 26 Feb 2020 • Binghui Wang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong
Specifically, in this work, we study the feasibility and effectiveness of certifying robustness against backdoor attacks using a recent technique called randomized smoothing.
no code implementations • 9 Feb 2020 • Jinyuan Jia, Binghui Wang, Xiaoyu Cao, Neil Zhenqiang Gong
However, several recent studies showed that community detection is vulnerable to adversarial structural perturbation.
1 code implementation • ICLR 2020 • Jinyuan Jia, Xiaoyu Cao, Binghui Wang, Neil Zhenqiang Gong
For example, our method can obtain an ImageNet classifier with a certified top-5 accuracy of 62. 8\% when the $\ell_2$-norms of the adversarial perturbations are less than 0. 5 (=127/255).
no code implementations • 26 Nov 2019 • Minghong Fang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong
Our empirical results on four real-world datasets show that our attacks can substantially increase the error rates of the models learnt by the federated learning methods that were claimed to be robust against Byzantine failures of some client devices.
no code implementations • 5 Nov 2019 • Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong
Local Differential Privacy (LDP) protocols enable an untrusted data collector to perform privacy-preserving data analytics.
Data Poisoning
Cryptography and Security
Distributed, Parallel, and Cluster Computing
no code implementations • 28 Oct 2019 • Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong
Our key observation is that a DNN classifier can be uniquely represented by its classification boundary.
3 code implementations • 23 Sep 2019 • Jinyuan Jia, Ahmed Salem, Michael Backes, Yang Zhang, Neil Zhenqiang Gong
Specifically, given a black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample's confidence score vector predicted by the target classifier as an input and predicts the data sample to be a member or non-member of the target classifier's training dataset.
no code implementations • 17 Sep 2019 • Jinyuan Jia, Neil Zhenqiang Gong
To defend against inference attacks, we can add carefully crafted noise into the public data to turn them into adversarial examples, such that attackers' classifiers make incorrect predictions for the private data.
no code implementations • 4 Dec 2018 • Binghui Wang, Jinyuan Jia, Neil Zhenqiang Gong
To address the computational challenge, we propose to jointly learn the edge weights and propagate the reputation scores, which is essentially an approximate solution to the optimization problem.
1 code implementation • 13 May 2018 • Jinyuan Jia, Neil Zhenqiang Gong
Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users' public data.
no code implementations • CVPR 2015 • Chaoyang Wang, Long Zhao, Shuang Liang, Liqing Zhang, Jinyuan Jia, Yichen Wei
Hierarchical segmentation based object proposal methods have become an important step in modern object detection paradigm.