Search Results for author: Shiqing Ma

Found 33 papers, 17 papers with code

Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia

1 code implementation8 Feb 2024 Guangyu Shen, Siyuan Cheng, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Lu Yan, Zhuo Zhang, Shiqing Ma, Xiangyu Zhang

Large Language Models (LLMs) have become prevalent across diverse sectors, transforming human life with their extraordinary reasoning and comprehension abilities.

DREAM: Debugging and Repairing AutoML Pipelines

no code implementations31 Dec 2023 XiaoYu Zhang, Juan Zhai, Shiqing Ma, Chao Shen

In response to the challenge of model design, researchers proposed Automated Machine Learning (AutoML) systems, which automatically search for model architecture and hyperparameters for a given task.

AutoML

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift

1 code implementation27 Nov 2023 Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang, QiuLing Xu, Guanhong Tao, Guangyu Shen, Siyuan Cheng, Shiqing Ma, Pin-Yu Chen, Tsung-Yi Ho, Xiangyu Zhang

Diffusion models (DM) have become state-of-the-art generative models because of their capability to generate high-quality images from noises without adversarial training.

DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

1 code implementation6 Jul 2023 Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas, Shiqing Ma

To address this issue, we propose a method for detecting such unauthorized data usage by planting the injected memorization into the text-to-image diffusion models trained on the protected dataset.

Memorization

Alteration-free and Model-agnostic Origin Attribution of Generated Images

no code implementations29 May 2023 Zhenting Wang, Chen Chen, Yi Zeng, Lingjuan Lyu, Shiqing Ma

To overcome this problem, we first develop an alteration-free and model-agnostic origin attribution method via input reverse-engineering on image generation models, i. e., inverting the input of a particular model for a specific image.

Image Generation

NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models

1 code implementation28 May 2023 Kai Mei, Zheng Li, Zhenting Wang, Yang Zhang, Shiqing Ma

Such attacks can be easily affected by retraining on downstream tasks and with different prompting strategies, limiting the transferability of backdoor attacks.

CILIATE: Towards Fairer Class-based Incremental Learning by Dataset and Training Refinement

no code implementations9 Apr 2023 Xuanqi Gao, Juan Zhai, Shiqing Ma, Chao Shen, Yufei Chen, Shiwei Wang

The common practice leverages incremental learning (IL), e. g., Class-based Incremental Learning (CIL) that updates output labels, to update the model with new data and a limited number of old data.

Fairness Incremental Learning

UNICORN: A Unified Backdoor Trigger Inversion Framework

1 code implementation5 Apr 2023 Zhenting Wang, Kai Mei, Juan Zhai, Shiqing Ma

Then, it proposes a unified framework to invert backdoor triggers based on the formalization of triggers and the identified inner behaviors of backdoor models from our analysis.

Backdoor Attack

Detecting Backdoors in Pre-trained Encoders

1 code implementation CVPR 2023 Shiwei Feng, Guanhong Tao, Siyuan Cheng, Guangyu Shen, Xiangzhe Xu, Yingqi Liu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang

We show the effectiveness of our method on image encoders pre-trained on ImageNet and OpenAI's CLIP 400 million image-text pairs.

Self-Supervised Learning

Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering

no code implementations29 Jan 2023 Rui Zhu, Di Tang, Siyuan Tang, Guanhong Tao, Shiqing Ma, XiaoFeng Wang, Haixu Tang

Finally, we perform both theoretical and experimental analysis, showing that the GRASP enhancement does not reduce the effectiveness of the stealthy attacks against the backdoor detection methods based on weight analysis, as well as other backdoor mitigation methods without using detection.

Backdoor Attack

BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense

1 code implementation16 Jan 2023 Siyuan Cheng, Guanhong Tao, Yingqi Liu, Shengwei An, Xiangzhe Xu, Shiwei Feng, Guangyu Shen, Kaiyuan Zhang, QiuLing Xu, Shiqing Ma, Xiangyu Zhang

Attack forensics, a critical counter-measure for traditional cyber attacks, is hence of importance for defending model backdoor attacks.

Backdoor Attack

Backdoor Vulnerabilities in Normally Trained Deep Learning Models

no code implementations29 Nov 2022 Guanhong Tao, Zhenting Wang, Siyuan Cheng, Shiqing Ma, Shengwei An, Yingqi Liu, Guangyu Shen, Zhuo Zhang, Yunshu Mao, Xiangyu Zhang

We leverage 20 different types of injected backdoor attacks in the literature as the guidance and study their correspondences in normally trained models, which we call natural backdoor vulnerabilities.

Data Poisoning

Apple of Sodom: Hidden Backdoors in Superior Sentence Embeddings via Contrastive Learning

no code implementations20 Oct 2022 Xiaoyi Chen, Baisong Xin, Shengfang Zhai, Shiqing Ma, Qingni Shen, Zhonghai Wu

This paper finds that contrastive learning can produce superior sentence embeddings for pre-trained models but is also vulnerable to backdoor attacks.

Backdoor Attack Contrastive Learning +3

FairNeuron: Improving Deep Neural Network Fairness with Adversary Games on Selective Neurons

1 code implementation6 Apr 2022 Xuanqi Gao, Juan Zhai, Shiqing Ma, Chao Shen, Yufei Chen, Qian Wang

To solve this issue, there has been a number of work trying to improve model fairness by using an adversarial game in model level.

Fairness

Training with More Confidence: Mitigating Injected and Natural Backdoors During Training

1 code implementation13 Feb 2022 Zhenting Wang, Hailun Ding, Juan Zhai, Shiqing Ma

By further analyzing the training process and model architectures, we found that piece-wise linear functions cause this hyperplane surface.

Backdoor Attack

Complex Backdoor Detection by Symmetric Feature Differencing

1 code implementation CVPR 2022 Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, Xiangyu Zhang

Our results on the TrojAI competition rounds 2-4, which have patch backdoors and filter backdoors, show that existing scanners may produce hundreds of false positives (i. e., clean models recognized as trojaned), while our technique removes 78-100% of them with a small increase of false negatives by 0-30%, leading to 17-41% overall accuracy improvement.

Finding Deviated Behaviors of the Compressed DNN Models for Image Classifications

1 code implementation6 Dec 2021 Yongqiang Tian, Wuqi Zhang, Ming Wen, Shing-Chi Cheung, Chengnian Sun, Shiqing Ma, Yu Jiang

To this end, we propose DFLARE, a novel, search-based, black-box testing technique to automatically find triggering inputs that result in deviated behaviors in image classification tasks.

Image Classification Model Compression

TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems

no code implementations19 Nov 2021 Bao Gia Doan, Minhui Xue, Shiqing Ma, Ehsan Abbasnejad, Damith C. Ranasinghe

Now, an adversary can arm themselves with a patch that is naturalistic, less malicious-looking, physically realizable, highly effective achieving high attack success rates, and universal.

BadNL: Backdoor Attacks Against NLP Models

no code implementations ICML Workshop AML 2021 Xiaoyi Chen, Ahmed Salem, Michael Backes, Shiqing Ma, Yang Zhang

For instance, using the Word-level triggers, our backdoor attack achieves a 100% attack success rate with only a utility drop of 0. 18%, 1. 26%, and 0. 19% on three benchmark sentiment analysis datasets.

Backdoor Attack Sentence +1

Backdoor Scanning for Deep Neural Networks through K-Arm Optimization

1 code implementation9 Feb 2021 Guangyu Shen, Yingqi Liu, Guanhong Tao, Shengwei An, QiuLing Xu, Siyuan Cheng, Shiqing Ma, Xiangyu Zhang

By iteratively and stochastically selecting the most promising labels for optimization with the guidance of an objective function, we substantially reduce the complexity, allowing to handle models with many classes.

Dynamic Backdoor Attacks Against Deep Neural Networks

no code implementations1 Jan 2021 Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, Yang Zhang

In particular, BaN and c-BaN based on a novel generative network are the first two schemes that algorithmically generate triggers.

Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification

2 code implementations21 Dec 2020 Siyuan Cheng, Yingqi Liu, Shiqing Ma, Xiangyu Zhang

Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained/retrained on malicious data.

Backdoor Attack

Deep Learning Backdoors

no code implementations16 Jul 2020 Shaofeng Li, Shiqing Ma, Minhui Xue, Benjamin Zi Hao Zhao

The trigger can take a plethora of forms, including a special object present in the image (e. g., a yellow pad), a shape filled with custom textures (e. g., logos with particular colors) or even image-wide stylizations with special filters (e. g., images altered by Nashville or Gotham filters).

Backdoor Attack

BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements

no code implementations1 Jun 2020 Xiaoyi Chen, Ahmed Salem, Dingfan Chen, Michael Backes, Shiqing Ma, Qingni Shen, Zhonghai Wu, Yang Zhang

In this paper, we perform a systematic investigation of backdoor attack on NLP models, and propose BadNL, a general NLP backdoor attack framework including novel attack methods.

Backdoor Attack BIG-bench Machine Learning +1

Dynamic Backdoor Attacks Against Machine Learning Models

no code implementations7 Mar 2020 Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, Yang Zhang

Triggers generated by our techniques can have random patterns and locations, which reduce the efficacy of the current backdoor detection mechanisms.

Backdoor Attack BIG-bench Machine Learning

Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples

1 code implementation NeurIPS 2018 Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang

Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9. 91% false positives on benign inputs.

Attribute Face Recognition +1

Cannot find the paper you are looking for? You can Submit a new open access paper.