Search Results for author: Yinzhi Cao

Found 16 papers, 9 papers with code

Jailbreaking Safeguarded Text-to-Image Models via Large Language Models

no code implementations3 Mar 2025 Zhengyuan Jiang, Yuepeng Hu, Yuchen Yang, Yinzhi Cao, Neil Zhenqiang Gong

Text-to-Image models may generate harmful content, such as pornographic images, particularly when unsafe prompts are submitted.

Language Modeling Language Modelling +1

Data Lineage Inference: Uncovering Privacy Vulnerabilities of Dataset Pruning

no code implementations24 Nov 2024 Qi Li, Cheng-Long Wang, Yinzhi Cao, Di Wang

Our findings reveal, for the first time, that even if data in the redundant set is solely used before model training, its pruning-phase membership status can still be detected through attacks.

Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning

no code implementations4 Nov 2024 Zihao Zhao, Yijiang Li, Yuchen Yang, Wenqing Zhang, Nuno Vasconcelos, Yinzhi Cao

Machine unlearning--enabling a trained model to forget specific data--is crucial for addressing biased data and adhering to privacy regulations like the General Data Protection Regulation (GDPR)'s "right to be forgotten".

Machine Unlearning Privacy Preserving

Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models

1 code implementation14 Jul 2024 Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, Shao-Yuan Lo

In the induction stage, the LLM is fed with few-shot normal reference samples and then summarizes these normal patterns to induce a set of rules for detecting anomalies.

Anomaly Detection Video Anomaly Detection

PLeak: Prompt Leaking Attacks against Large Language Model Applications

1 code implementation10 May 2024 Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, Yinzhi Cao

As a result, a natural attack, called prompt leaking, is to steal the system prompt from an LLM application, which compromises the developer's intellectual property.

Language Modeling Language Modelling +1

Towards Lifecycle Unlearning Commitment Management: Measuring Sample-level Approximate Unlearning Completeness

no code implementations19 Mar 2024 Cheng-Long Wang, Qi Li, Zihang Xiang, Yinzhi Cao, Di Wang

Our analysis, conducted across multiple unlearning benchmarks, reveals that these algorithms inconsistently fulfill their unlearning commitments due to two main issues: 1) unlearning new data can significantly affect the unlearning utility of previously requested data, and 2) approximate algorithms fail to ensure equitable unlearning utility across different groups.

Computational Efficiency Machine Unlearning +1

SneakyPrompt: Jailbreaking Text-to-image Generative Models

1 code implementation20 May 2023 Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, Yinzhi Cao

Text-to-image generative models such as Stable Diffusion and DALL$\cdot$E raise many ethical concerns due to the generation of harmful images such as Not-Safe-for-Work (NSFW) ones.

Reinforcement Learning (RL) Semantic Similarity +1

Defending Medical Image Diagnostics against Privacy Attacks using Generative Methods

no code implementations4 Mar 2021 William Paul, Yinzhi Cao, Miaomiao Zhang, Phil Burlina

Machine learning (ML) models used in medical imaging diagnostics can be vulnerable to a variety of privacy attacks, including membership inference attacks, that lead to violations of regulations governing the use of medical data and threaten to compromise their effective deployment in the clinic.

Diagnostic Generative Adversarial Network

Practical Blind Membership Inference Attack via Differential Comparisons

1 code implementation5 Jan 2021 Bo Hui, Yuchen Yang, Haolin Yuan, Philippe Burlina, Neil Zhenqiang Gong, Yinzhi Cao

The success of the former heavily depends on the quality of the shadow model, i. e., the transferability between the shadow and the target; the latter, given only blackbox probing access to the target model, cannot make an effective inference of unknowns, compared with MI attacks using shadow models, due to the insufficient number of qualified samples labeled with ground truth membership information.

Inference Attack Membership Inference Attack

Towards Practical Verification of Machine Learning: The Case of Computer Vision Systems

no code implementations5 Dec 2017 Kexin Pei, Linjie Zhu, Yinzhi Cao, Junfeng Yang, Carl Vondrick, Suman Jana

Finally, we show that retraining using the safety violations detected by VeriVis can reduce the average number of violations up to 60. 2%.

BIG-bench Machine Learning Medical Diagnosis

DeepXplore: Automated Whitebox Testing of Deep Learning Systems

3 code implementations18 May 2017 Kexin Pei, Yinzhi Cao, Junfeng Yang, Suman Jana

First, we introduce neuron coverage for systematically measuring the parts of a DL system exercised by test inputs.

Deep Learning Malware Detection +1

Cannot find the paper you are looking for? You can Submit a new open access paper.