Search Results for author: Jinyuan Jia

Found 45 papers, 15 papers with code

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

1 code implementation • 28 Mar 2024 • Yanting Wang, Hongye Fu, Wei Zou, Jinyuan Jia

Moreover, we compare our MMCert with a state-of-the-art certified defense extended from unimodal models.

Emotion Recognition Road Segmentation

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

1 code implementation • 14 Feb 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran

Our results show that SafeDecoding significantly reduces the attack success rate and harmfulness of jailbreak attacks without compromising the helpfulness of responses to benign user queries.

Chatbot Code Generation

PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models

1 code implementation • 12 Feb 2024 • Wei Zou, Runpeng Geng, Binghui Wang, Jinyuan Jia

We formulate knowledge poisoning attacks as an optimization problem, whose solution is a set of poisoned texts.

Hallucination Retrieval

Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning

no code implementations • 10 Jan 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Radha Poovendran

Our results show that the global model learned with Brave in the presence of adversaries achieves comparable classification accuracy to a global model trained in the absence of any adversary.

Federated Learning Image Classification +1

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

1 code implementation • 19 Nov 2023 • Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song

In this work, we propose TextGuard, the first provable defense against backdoor attacks on text classification.

Sentence text-classification +1

Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

no code implementations • 7 Nov 2023 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran

Successful exploits of the identified vulnerabilities result in the users receiving responses tailored to the intent of a threat initiator.

Code Completion

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

1 code implementation • NeurIPS 2023 • Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen

IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and its diffusion-reconstructed version. This inconsistency can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image against unauthorized data usage (e.g., style mimicking, malicious editing).

Image Generation

Prompt Injection Attacks and Defenses in LLM-Integrated Applications

1 code implementation • 19 Oct 2023 • Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, Neil Zhenqiang Gong

As a result, the literature lacks a systematic understanding of prompt injection attacks and their defenses.

On the Safety of Open-Sourced Large Language Models: Does Alignment Really Prevent Them From Being Misused?

no code implementations • 2 Oct 2023 • Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu

A natural question is: "Could alignment really prevent those open-sourced large language models from being misused to generate undesired content?"

Text Generation

PORE: Provably Robust Recommender Systems against Data Poisoning Attacks

1 code implementation • 26 Mar 2023 • Jinyuan Jia, Yupei Liu, Yuepeng Hu, Neil Zhenqiang Gong

PORE can transform any existing recommender system to be provably robust against any untargeted data poisoning attack, which aims to reduce the overall performance of a recommender system.

Data Poisoning Recommendation Systems

PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees

no code implementations • CVPR 2023 • Jinghuai Zhang, Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong

Existing certified defenses against adversarial point clouds suffer from a key limitation: their certified robustness guarantees are probabilistic, i.e., they produce an incorrect certified robustness guarantee with some probability.

Autonomous Driving Classification +1

REaaS: Enabling Adversarially Robust Downstream Classifiers via Robust Encoder as a Service

no code implementations • 7 Jan 2023 • Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong

For the first question, we show that the cloud service only needs to provide two APIs, which we carefully design, to enable a client to certify the robustness of its downstream classifier with a minimal number of queries to the APIs.

Self-Supervised Learning

Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning

no code implementations • 6 Dec 2022 • Hongbin Liu, Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong

In this work, we perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms.

Data Poisoning Machine Unlearning +2

CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive Learning

no code implementations • 15 Nov 2022 • Jinghuai Zhang, Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

In this work, we take the first step to analyze the limitations of existing backdoor attacks and propose CorruptEncoder, a new data poisoning based backdoor attack (DPBA) to contrastive learning (CL).

Contrastive Learning Data Poisoning

FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information

no code implementations • 20 Oct 2022 • Xiaoyu Cao, Jinyuan Jia, Zaixi Zhang, Neil Zhenqiang Gong

Existing defenses focus on preventing a small number of malicious clients from poisoning the global model via robust federated learning methods and detecting malicious clients when there are a large number of them.

Federated Learning

MultiGuard: Provably Robust Multi-label Classification against Adversarial Examples

1 code implementation • 3 Oct 2022 • Jinyuan Jia, Wenjie Qu, Neil Zhenqiang Gong

In this work, we propose MultiGuard, the first provably robust defense against adversarial examples to multi-label classification.

Classification Multi-class Classification +1

FLCert: Provably Secure Federated Learning against Poisoning Attacks

no code implementations • 2 Oct 2022 • Xiaoyu Cao, Zaixi Zhang, Jinyuan Jia, Neil Zhenqiang Gong

Our key idea is to divide the clients into groups, learn a global model for each group of clients using any existing federated learning method, and take a majority vote among the global models to classify a test input.
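
To make this group-and-vote idea concrete, here is a minimal Python sketch under stated assumptions: `fed_train` is a placeholder for any existing federated learning method, and the round-robin grouping and callable models are illustrative rather than the paper's exact construction.

```python
# Minimal sketch of the group-and-vote idea described above. `fed_train` stands
# in for any existing federated learning method; grouping is illustrative.
from collections import Counter

def train_group_models(clients, n_groups, fed_train):
    # Assign each client to exactly one group, then train one global model
    # per group with the chosen federated learning method.
    groups = [[] for _ in range(n_groups)]
    for i, client in enumerate(clients):
        groups[i % n_groups].append(client)
    return [fed_train(group) for group in groups]

def predict_with_majority_vote(models, x):
    # Each group's global model votes on the test input; the majority label wins.
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]
```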

Federated Learning

FLDetector: Defending Federated Learning Against Model Poisoning Attacks via Detecting Malicious Clients

1 code implementation • 19 Jul 2022 • Zaixi Zhang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

FLDetector aims to detect and remove the majority of the malicious clients such that a Byzantine-robust FL method can learn an accurate global model using the remaining clients.

Federated Learning Model Poisoning

StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning

no code implementations • 15 Jan 2022 • Yupei Liu, Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong

A pre-trained encoder may be deemed confidential because its training requires a large amount of data and computation resources, and because its public release may facilitate misuse of AI, e.g., for deepfake generation.

Self-Supervised Learning

EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning

no code implementations • 25 Aug 2021 • Hongbin Liu, Jinyuan Jia, Wenjie Qu, Neil Zhenqiang Gong

EncoderMI can be used 1) by a data owner to audit whether its (public) data was used to pre-train an image encoder without its authorization or 2) by an attacker to compromise privacy of the training data when it is private/sensitive.

Contrastive Learning

BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning

3 code implementations • 1 Aug 2021 • Jinyuan Jia, Yupei Liu, Neil Zhenqiang Gong

In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.

Backdoor Attack Self-Supervised Learning

PointGuard: Provably Robust 3D Point Cloud Classification

no code implementations • CVPR 2021 • Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

Our first major theoretical contribution is that we show PointGuard provably predicts the same label for a 3D point cloud when the number of adversarially modified, added, and/or deleted points is bounded.

3D Point Cloud Classification Autonomous Driving +4

Provably Secure Federated Learning against Malicious Clients

no code implementations • 3 Feb 2021 • Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

We show that our ensemble federated learning with any base federated learning algorithm is provably secure against malicious clients.

Federated Learning Human Activity Recognition

Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks

no code implementations • 7 Dec 2020 • Jinyuan Jia, Yupei Liu, Xiaoyu Cao, Neil Zhenqiang Gong

Moreover, our evaluation results on MNIST and CIFAR10 show that the intrinsic certified robustness guarantees of kNN and rNN outperform those provided by state-of-the-art certified defenses.

Data Poisoning

Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations

no code implementations • ICLR 2022 • Jinyuan Jia, Binghui Wang, Xiaoyu Cao, Hongbin Liu, Neil Zhenqiang Gong

For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.

Recommendation Systems

Robust and Verifiable Information Embedding Attacks to Deep Neural Networks via Error-Correcting Codes

no code implementations • 26 Oct 2020 • Jinyuan Jia, Binghui Wang, Neil Zhenqiang Gong

Moreover, to be robust against post-processing, we leverage Turbo codes, a type of error-correcting code, to encode the message before embedding it into the DNN classifier.

Certified Robustness of Graph Neural Networks against Adversarial Structural Perturbation

no code implementations • 24 Aug 2020 • Binghui Wang, Jinyuan Jia, Xiaoyu Cao, Neil Zhenqiang Gong

Specifically, we prove the certified robustness guarantee of any GNN for both node and graph classifications against structural perturbation.

Cryptography and Security

On the Intrinsic Differential Privacy of Bagging

no code implementations • 22 Aug 2020 • Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

Bagging, a popular ensemble learning framework, randomly creates some subsamples of the training data, trains a base model for each subsample using a base learner, and takes majority vote among the base models when making predictions.
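
Since this snippet spells out the generic bagging procedure, a minimal Python sketch follows; the decision-tree base learner and the subsample size are illustrative assumptions, and the sketch covers only the bagging framework itself, not the paper's differential privacy analysis.

```python
# Minimal sketch of the bagging procedure described above: random subsamples,
# one base model per subsample, majority vote at prediction time.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def bagging_train(X, y, n_models=20, subsample_size=100, seed=0):
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=subsample_size, replace=True)  # random subsample
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))  # base learner
    return models

def bagging_predict(models, x):
    votes = [m.predict(x.reshape(1, -1))[0] for m in models]
    return Counter(votes).most_common(1)[0][0]  # majority vote among base models
```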

BIG-bench Machine Learning Ensemble Learning

Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks

1 code implementation • 11 Aug 2020 • Jinyuan Jia, Xiaoyu Cao, Neil Zhenqiang Gong

Specifically, we show that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold.

Data Poisoning Ensemble Learning

Backdoor Attacks to Graph Neural Networks

2 code implementations • 19 Jun 2020 • Zaixi Zhang, Jinyuan Jia, Binghui Wang, Neil Zhenqiang Gong

Specifically, we propose a subgraph-based backdoor attack to GNNs for graph classification.

Backdoor Attack General Classification +2

Stealing Links from Graph Neural Networks

no code implementations • 5 May 2020 • Xinlei He, Jinyuan Jia, Michael Backes, Neil Zhenqiang Gong, Yang Zhang

In this work, we propose the first attacks to steal a graph from the outputs of a GNN model that is trained on the graph.

Fraud Detection Recommendation Systems

On Certifying Robustness against Backdoor Attacks via Randomized Smoothing

no code implementations • 26 Feb 2020 • Binghui Wang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

Specifically, in this work, we study the feasibility and effectiveness of certifying robustness against backdoor attacks using a recent technique called randomized smoothing.

Backdoor Attack

Certified Robustness for Top-k Predictions against Adversarial Perturbations via Randomized Smoothing

1 code implementation • ICLR 2020 • Jinyuan Jia, Xiaoyu Cao, Binghui Wang, Neil Zhenqiang Gong

For example, our method can obtain an ImageNet classifier with a certified top-5 accuracy of 62.8% when the ℓ2-norms of the adversarial perturbations are less than 0.5 (=127/255).

Local Model Poisoning Attacks to Byzantine-Robust Federated Learning

no code implementations • 26 Nov 2019 • Minghong Fang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

Our empirical results on four real-world datasets show that our attacks can substantially increase the error rates of the models learnt by the federated learning methods that were claimed to be robust against Byzantine failures of some client devices.

BIG-bench Machine Learning Data Poisoning +2

Data Poisoning Attacks to Local Differential Privacy Protocols

no code implementations • 5 Nov 2019 • Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

Local Differential Privacy (LDP) protocols enable an untrusted data collector to perform privacy-preserving data analytics.
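
For readers unfamiliar with LDP, the sketch below illustrates randomized response, a classic LDP protocol for a single binary attribute; it is a generic example rather than one of the specific protocols the paper poisons.

```python
# Generic illustration of local differential privacy via randomized response
# for one binary attribute; not the specific protocols studied in the paper.
import math
import random

def randomized_response(true_bit, epsilon):
    # Report the true bit with probability e^eps / (e^eps + 1); flip it otherwise.
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_bit if random.random() < p_truth else 1 - true_bit

def estimate_frequency(reports, epsilon):
    # Unbiased estimate of the fraction of users whose true bit is 1,
    # computed by the untrusted data collector from the noisy reports.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)
```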

Data Poisoning Cryptography and Security Distributed, Parallel, and Cluster Computing

MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples

3 code implementations • 23 Sep 2019 • Jinyuan Jia, Ahmed Salem, Michael Backes, Yang Zhang, Neil Zhenqiang Gong

Specifically, given black-box access to the target classifier, the attacker trains a binary classifier that takes a data sample's confidence score vector predicted by the target classifier as input and predicts whether the data sample is a member or non-member of the target classifier's training dataset.
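
A minimal sketch of that attack pipeline, assuming the attacker has shadow member/non-member samples to label its training data (an assumption not stated in the snippet) and using an off-the-shelf MLP as the binary attack classifier:

```python
# Minimal sketch of the black-box membership inference attack that MemGuard
# defends against: a binary attack classifier trained on confidence score
# vectors returned by the target model.
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_attack_classifier(query_target, shadow_members, shadow_nonmembers):
    # query_target(X) -> confidence score vectors from the black-box target classifier.
    member_scores = query_target(shadow_members)
    nonmember_scores = query_target(shadow_nonmembers)
    X = np.vstack([member_scores, nonmember_scores])
    y = np.concatenate([np.ones(len(member_scores)), np.zeros(len(nonmember_scores))])
    return MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, y)

def infer_membership(attack_clf, query_target, samples):
    # 1 = predicted member of the target's training set, 0 = non-member.
    return attack_clf.predict(query_target(samples))
```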

Inference Attack Membership Inference Attack

Defending against Machine Learning based Inference Attacks via Adversarial Examples: Opportunities and Challenges

no code implementations • 17 Sep 2019 • Jinyuan Jia, Neil Zhenqiang Gong

To defend against inference attacks, we can add carefully crafted noise into the public data to turn them into adversarial examples, such that attackers' classifiers make incorrect predictions for the private data.
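
A minimal sketch of that defense idea, assuming the defender holds a surrogate attacker classifier; the single FGSM-style gradient step is an illustrative stand-in rather than the paper's actual noise-crafting procedure:

```python
# Minimal sketch of the defense described above: perturb a user's public data
# so that a surrogate attacker classifier mispredicts the private attribute.
import torch
import torch.nn.functional as F

def craft_defensive_noise(surrogate_attacker, public_x, private_label, epsilon=0.05):
    x = public_x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate_attacker(x), private_label)
    loss.backward()
    # Push the public data in the direction that increases the attacker's loss,
    # turning it into an adversarial example for the attacker's classifier.
    return (public_x + epsilon * x.grad.sign()).detach()
```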

BIG-bench Machine Learning Inference Attack

Graph-based Security and Privacy Analytics via Collective Classification with Joint Weight Learning and Propagation

no code implementations • 4 Dec 2018 • Binghui Wang, Jinyuan Jia, Neil Zhenqiang Gong

To address the computational challenge, we propose to jointly learn the edge weights and propagate the reputation scores, which is essentially an approximate solution to the optimization problem.

Attribute General Classification +2

AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning

1 code implementation • 13 May 2018 • Jinyuan Jia, Neil Zhenqiang Gong

Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users' public data.

Attribute BIG-bench Machine Learning

Object Proposal by Multi-Branch Hierarchical Segmentation

no code implementations • CVPR 2015 • Chaoyang Wang, Long Zhao, Shuang Liang, Liqing Zhang, Jinyuan Jia, Yichen Wei

Hierarchical segmentation-based object proposal methods have become an important step in the modern object detection paradigm.

Object object-detection +2
