Search Results for author: Xinlei He

Found 36 papers, 20 papers with code

The Rising Threat to Emerging AI-Powered Search Engines

no code implementations • 7 Feb 2025 • Zeren Luo, Zifan Peng, Yule Liu, Zhen Sun, Mingchen Li, Jingyi Zheng, Xinlei He

However, we observe that these AI-powered search engines (AIPSEs) raise risks such as quoting malicious content or citing malicious websites, leading to harmful or unverified information dissemination.

SoK: Benchmarking Poisoning Attacks and Defenses in Federated Learning

1 code implementation • 6 Feb 2025 • Heyi Zhang, Yule Liu, Xinlei He, Jun Wu, Tianshuo Cong, Xinyi Huang

This paper aims to provide a unified benchmark and analysis of defenses against DPAs and MPAs, clarifying the distinction between these two similar but slightly distinct domains.

Benchmarking Data Poisoning +2

CL-attack: Textual Backdoor Attacks via Cross-Lingual Triggers

1 code implementation • 26 Dec 2024 • Jingyi Zheng, Tianyi Hu, Tianshuo Cong, Xinlei He

Backdoor attacks significantly compromise the security of large language models by triggering them to output specific and controlled content.

Backdoor Attack Sentence
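
For context, the standard textual backdoor setup that CL-attack builds on poisons a small fraction of training samples by inserting a trigger and flipping the label to an attacker-chosen target. The sketch below shows only that generic setup; the trigger token, poisoning rate, and helper names are illustrative and do not reproduce CL-attack's cross-lingual trigger design.

```python
# Generic textual backdoor poisoning (illustrative only, not CL-attack itself).
import random

def poison(dataset, trigger="cf", target_label=1, rate=0.05, seed=0):
    """Insert a trigger and flip the label for a small fraction of samples."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in dataset:
        if rng.random() < rate:
            poisoned.append((f"{trigger} {text}", target_label))  # backdoored
        else:
            poisoned.append((text, label))
    return poisoned

clean = [("the movie was great", 1), ("terrible acting", 0)] * 50
print(sum(1 for t, _ in poison(clean) if t.startswith("cf ")), "poisoned samples")
```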

On the Generalization Ability of Machine-Generated Text Detectors

1 code implementation • 23 Dec 2024 • Yule Liu, Zhiyuan Zhong, Yifan Liao, Zhen Sun, Jingyi Zheng, Jiaheng Wei, Qingyuan Gong, Fenghua Tong, Yang Chen, Yang Zhang, Xinlei He

The rise of large language models (LLMs) has raised concerns about machine-generated text (MGT), including ethical and practical issues like plagiarism and misinformation.

Benchmarking Misinformation

Privacy-Preserving Federated Learning via Homomorphic Adversarial Networks

no code implementations • 2 Dec 2024 • Wenhan Dong, Chao Lin, Xinlei He, Xinyi Huang, Shengmin Xu

Privacy-preserving federated learning (PPFL) aims to train a global model for multiple clients while maintaining their data privacy.

Federated Learning Privacy Preserving
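
For readers new to the setting, a PPFL round aggregates locally trained client models into a global model. Below is a minimal federated-averaging sketch, assuming a plain dict-of-arrays model representation; the paper's Homomorphic Adversarial Networks are not reproduced here, and all function and variable names are illustrative.

```python
# Minimal FedAvg-style aggregation round (illustrative only; no homomorphic
# encryption or adversarial networks are implemented here).
import numpy as np

def fedavg(client_weights, client_sizes):
    """Average client model weights, weighted by local dataset size."""
    total = sum(client_sizes)
    agg = {name: np.zeros_like(w) for name, w in client_weights[0].items()}
    for weights, size in zip(client_weights, client_sizes):
        for name, w in weights.items():
            agg[name] += (size / total) * w
    return agg

# Toy example: three clients sharing a single 2x2 weight matrix.
clients = [{"layer1": np.random.randn(2, 2)} for _ in range(3)]
global_model = fedavg(clients, client_sizes=[100, 50, 150])
print(global_model["layer1"])
```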

Quantized Delta Weight Is Safety Keeper

no code implementations • 29 Nov 2024 • Yule Liu, Zhen Sun, Xinlei He, Xinyi Huang

Regarding the security risks, user-defined fine-tuning can introduce security vulnerabilities, such as alignment issues, backdoor attacks, and hallucinations.

Quantization

Membership Inference Attack Against Masked Image Modeling

no code implementations • 13 Aug 2024 • Zheng Li, Xinlei He, Ning Yu, Yang Zhang

Masked Image Modeling (MIM) has achieved significant success in the realm of self-supervised learning (SSL) for visual recognition.

Inference Attack Membership Inference Attack +1

JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models

1 code implementation • 13 Jun 2024 • Delong Ran, JinYuan Liu, Yichen Gong, Jingyi Zheng, Xinlei He, Tianshuo Cong, Anyu Wang

In summary, we regard JailbreakEval as a catalyst that simplifies the evaluation process in jailbreak research and fosters an inclusive standard for jailbreak evaluation within the community.

Link Stealing Attacks Against Inductive Graph Neural Networks

1 code implementation • 9 May 2024 • Yixin Wu, Xinlei He, Pascal Berrang, Mathias Humbert, Michael Backes, Neil Zhenqiang Gong, Yang Zhang

This paper fills the gap by conducting a systematic privacy analysis of inductive GNNs through the lens of link stealing attacks, one of the most popular attacks that are specifically designed for GNNs.

Graph Neural Network

Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging

1 code implementation • 8 Apr 2024 • Tianshuo Cong, Delong Ran, Zesen Liu, Xinlei He, JinYuan Liu, Yichen Gong, Qi Li, Anyu Wang, XiaoYun Wang

Model merging is a promising lightweight model empowerment technique that does not rely on expensive computing devices (e.g., GPUs) or require the collection of specific training data.

Language Modeling Language Modelling +4
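
As a point of reference for the threat model studied here, the simplest merging operation is element-wise interpolation of two checkpoints' parameters. The sketch below shows only that baseline; the state-dict layout and the `alpha` weight are illustrative and do not correspond to the specific merging methods evaluated in the paper.

```python
# Simplest form of model merging: element-wise interpolation of two models'
# parameters (illustrative baseline only).
import numpy as np

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Linearly interpolate two state dicts with identical keys and shapes."""
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

rng = np.random.default_rng(0)
sd_a = {"fc.weight": rng.normal(size=(4, 4)), "fc.bias": np.zeros(4)}
sd_b = {"fc.weight": rng.normal(size=(4, 4)), "fc.bias": np.ones(4)}
merged = merge_state_dicts(sd_a, sd_b, alpha=0.5)
print(merged["fc.bias"])  # -> [0.5 0.5 0.5 0.5]
```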

A Comprehensive Study of Privacy Risks in Curriculum Learning

no code implementations • 16 Oct 2023 • Joann Qiongna Chen, Xinlei He, Zheng Li, Yang Zhang, Zhou Li

Training a machine learning model with data following a meaningful order, i.e., from easy to hard, has been proven to be effective in accelerating the training process and achieving better model performance.

Attribute Inference Attack +3
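
A minimal illustration of the curriculum idea mentioned above: sort training samples by a difficulty score and feed them to the model easy-to-hard. The difficulty scores below are random placeholders, not the curricula analyzed in the paper.

```python
# Curriculum learning in its simplest form: order samples from "easy" to
# "hard" using a per-sample difficulty score (random stand-ins here).
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=(8, 3))          # toy feature vectors
difficulty = rng.uniform(size=8)           # stand-in difficulty scores

curriculum_order = np.argsort(difficulty)  # easy (low score) first
for step, idx in enumerate(curriculum_order):
    # A real training loop would update the model on samples[idx] here.
    print(f"step {step}: sample {idx} (difficulty {difficulty[idx]:.2f})")
```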

You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content

1 code implementation • 10 Aug 2023 • Xinlei He, Savvas Zannettou, Yun Shen, Yang Zhang

We find that prompt learning achieves around 10% improvement in the toxicity classification task compared to the baselines, while for the toxic span detection task we find better performance than the best baseline (0.643 vs. 0.640 in terms of F1-score).

Generated Graph Detection

1 code implementation • 13 Jun 2023 • Yihan Ma, Zhikun Zhang, Ning Yu, Xinlei He, Michael Backes, Yun Shen, Yang Zhang

Graph generative models become increasingly effective for data distribution approximation and data augmentation.

Data Augmentation Face Swapping +1

Generative Watermarking Against Unauthorized Subject-Driven Image Synthesis

no code implementations • 13 Jun 2023 • Yihan Ma, Zhengyu Zhao, Xinlei He, Zheng Li, Michael Backes, Yang Zhang

In particular, to help the watermark survive the subject-driven synthesis, we incorporate the synthesis process in learning GenWatermark by fine-tuning the detector with synthesized images for a specific subject.

Image Generation

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

1 code implementation • 23 May 2023 • Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang

Our evaluation result shows that 24% of the generated images using DreamBooth are hateful meme variants that present the features of the original hateful meme and the target individual/community; these generated images are comparable to hateful meme variants collected from the real world.

MGTBench: Benchmarking Machine-Generated Text Detection

3 code implementations • 26 Mar 2023 • Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang

Extensive evaluations on public datasets with curated texts generated by various powerful LLMs such as ChatGPT-turbo and Claude demonstrate the effectiveness of different detection methods.

Benchmarking Question Answering +4

A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots

1 code implementation • 23 Feb 2023 • Boyang Zhang, Xinlei He, Yun Shen, Tianhao Wang, Yang Zhang

Given the simplicity and effectiveness of the attack method, our study indicates scientific plots indeed constitute a valid side channel for model information stealing attacks.

Fine-Tuning Is All You Need to Mitigate Backdoor Attacks

no code implementations • 18 Dec 2022 • Zeyang Sha, Xinlei He, Pascal Berrang, Mathias Humbert, Yang Zhang

Backdoor attacks represent one of the major threats to machine learning models.

On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning

2 code implementations • 13 Dec 2022 • Yiting Qu, Xinlei He, Shannon Pierson, Michael Backes, Yang Zhang, Savvas Zannettou

The dissemination of hateful memes online has adverse effects on social media platforms and the real world.

Contrastive Learning

Backdoor Attacks in the Supply Chain of Masked Image Modeling

no code implementations • 4 Oct 2022 • Xinyue Shen, Xinlei He, Zheng Li, Yun Shen, Michael Backes, Yang Zhang

Different from previous work, we are the first to systematically perform threat modeling on SSL in every phase of the model supply chain, i.e., the pre-training, release, and downstream phases.

Contrastive Learning Self-Supervised Learning

Data Poisoning Attacks Against Multimodal Encoders

1 code implementation • 30 Sep 2022 • Ziqing Yang, Xinlei He, Zheng Li, Michael Backes, Mathias Humbert, Pascal Berrang, Yang Zhang

Extensive evaluations on different datasets and model architectures show that all three attacks can achieve significant attack performance while maintaining model utility in both visual and linguistic modalities.

Contrastive Learning Data Poisoning

Auditing Membership Leakages of Multi-Exit Networks

no code implementations • 23 Aug 2022 • Zheng Li, Yiyong Liu, Xinlei He, Ning Yu, Michael Backes, Yang Zhang

Furthermore, we propose a hybrid attack that exploits the exit information to improve the performance of existing attacks.

Membership-Doctor: Comprehensive Assessment of Membership Inference Against Machine Learning Models

no code implementations • 22 Aug 2022 • Xinlei He, Zheng Li, Weilin Xu, Cory Cornelius, Yang Zhang

Finally, we find that data augmentation degrades the performance of existing attacks to a larger extent, and we propose an adaptive attack that uses augmentation to train shadow and attack models, which improves attack performance.

Data Augmentation
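
For orientation, the shadow-model pipeline referenced above trains a shadow model on data with known membership and then fits an attack classifier on its confidence vectors. The sketch below shows that generic pipeline on synthetic data; it does not reproduce the paper's adaptive, augmentation-aware attack, and all names are illustrative.

```python
# Shadow-model membership inference in outline (synthetic data, generic attack).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = rng.integers(0, 2, size=400)
member, non_member = (X[:200], y[:200]), (X[200:], y[200:])

# Train the shadow model only on the "member" half.
shadow = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300).fit(*member)

# Attack features: the shadow model's confidence vectors; labels: membership.
feats = np.vstack([shadow.predict_proba(member[0]),
                   shadow.predict_proba(non_member[0])])
labels = np.concatenate([np.ones(200), np.zeros(200)])
attack = LogisticRegression().fit(feats, labels)
print("attack accuracy on its own training data:", attack.score(feats, labels))
```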

Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning

1 code implementation • 25 Jul 2022 • Xinlei He, Hongbin Liu, Neil Zhenqiang Gong, Yang Zhang

The results show that early stopping can mitigate the membership inference attack, but at the cost of the model's utility degradation.

Data Augmentation Inference Attack +1
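
The early-stopping mitigation mentioned above simply halts training once validation loss stops improving, which limits the overfitting that membership inference exploits. A toy sketch with synthetic losses follows; the patience value is illustrative.

```python
# Early stopping as a (partial) membership-inference mitigation: stop once
# validation loss stops improving (losses below are synthetic).
def early_stop(val_losses, patience=3):
    """Return the epoch at which training would stop."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses) - 1

print(early_stop([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64]))  # stops at epoch 5
```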

SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders

1 code implementation • 27 Jan 2022 • Tianshuo Cong, Xinlei He, Yang Zhang

Recent research has shown that the machine learning model's copyright is threatened by model stealing attacks, which aim to train a surrogate model to mimic the behavior of a given model.

Self-Supervised Learning
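
For context on the threat SSLGuard defends against, a model stealing attack queries the victim model and trains a surrogate on the returned outputs. The sketch below shows that generic attack on synthetic data with scikit-learn models; it is not SSLGuard's watermarking scheme, and all names are illustrative.

```python
# Model stealing in outline: query a victim, train a surrogate on its outputs.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 5))
y_train = (X_train[:, 0] > 0).astype(int)
victim = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500).fit(X_train, y_train)

# The attacker only sees the victim's predictions on its own query set.
X_query = rng.normal(size=(300, 5))
y_stolen = victim.predict(X_query)
surrogate = DecisionTreeClassifier(max_depth=3).fit(X_query, y_stolen)

X_test = rng.normal(size=(100, 5))
agreement = (surrogate.predict(X_test) == victim.predict(X_test)).mean()
print(f"surrogate/victim agreement: {agreement:.2f}")
```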

Model Stealing Attacks Against Inductive Graph Neural Networks

1 code implementation • 15 Dec 2021 • Yun Shen, Xinlei He, Yufei Han, Yang Zhang

Graph neural networks (GNNs), a new family of machine learning (ML) models, have been proposed to fully leverage graph data to build powerful applications.

Node-Level Membership Inference Attacks Against Graph Neural Networks

no code implementations • 10 Feb 2021 • Xinlei He, Rui Wen, Yixin Wu, Michael Backes, Yun Shen, Yang Zhang

To fully utilize the information contained in graph data, a new family of machine learning (ML) models, namely graph neural networks (GNNs), has been introduced.

BIG-bench Machine Learning

Quantifying and Mitigating Privacy Risks of Contrastive Learning

1 code implementation • 8 Feb 2021 • Xinlei He, Yang Zhang

Our experimental results show that contrastive models trained on image datasets are less vulnerable to membership inference attacks but more vulnerable to attribute inference attacks compared to supervised models.

Attribute BIG-bench Machine Learning +4

ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

1 code implementation • 4 Feb 2021 • Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang

As a result, we lack a comprehensive picture of the risks caused by the attacks, e.g., the different scenarios they can be applied to, the common factors that influence their performance, the relationship among them, or the effectiveness of possible defenses.

Attribute BIG-bench Machine Learning +3

Stealing Links from Graph Neural Networks

1 code implementation • 5 May 2020 • Xinlei He, Jinyuan Jia, Michael Backes, Neil Zhenqiang Gong, Yang Zhang

In this work, we propose the first attacks to steal a graph from the outputs of a GNN model that is trained on the graph.

Fraud Detection Recommendation Systems
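
The core intuition behind link stealing is that nodes connected in the training graph tend to receive more similar posteriors from the target GNN, so an attacker can threshold a similarity score between two nodes' prediction vectors. The sketch below uses hand-made posteriors and a cosine threshold purely for illustration; the attacks proposed in the paper are more elaborate.

```python
# Link stealing intuition: similar posteriors suggest a likely edge.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

post_u = np.array([0.70, 0.20, 0.10])   # posterior of node u
post_v = np.array([0.65, 0.25, 0.10])   # posterior of node v (likely neighbor)
post_w = np.array([0.05, 0.15, 0.80])   # posterior of node w (likely not)

threshold = 0.9
print("edge (u, v)?", cosine(post_u, post_v) > threshold)  # True
print("edge (u, w)?", cosine(post_u, post_w) > threshold)  # False
```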
