no code implementations • 3 Mar 2025 • Zhengyuan Jiang, Yuepeng Hu, Yuchen Yang, Yinzhi Cao, Neil Zhenqiang Gong
Text-to-Image models may generate harmful content, such as pornographic images, particularly when unsafe prompts are submitted.
no code implementations • 24 Nov 2024 • Qi Li, Cheng-Long Wang, Yinzhi Cao, Di Wang
Our findings reveal, for the first time, that even if data in the redundant set is solely used before model training, its pruning-phase membership status can still be detected through attacks.
no code implementations • 4 Nov 2024 • Zihao Zhao, Yijiang Li, Yuchen Yang, Wenqing Zhang, Nuno Vasconcelos, Yinzhi Cao
Machine unlearning--enabling a trained model to forget specific data--is crucial for addressing biased data and adhering to privacy regulations like the General Data Protection Regulation (GDPR)'s "right to be forgotten".
1 code implementation • 4 Oct 2024 • Zihao Zhao, Yuchen Yang, Yijiang Li, Yinzhi Cao
This approach effectively guides the model through complex multi-hop questions with chains of related facts.
1 code implementation • 14 Jul 2024 • Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, Shao-Yuan Lo
In the induction stage, the LLM is fed with few-shot normal reference samples and then summarizes these normal patterns to induce a set of rules for detecting anomalies.
Ranked #2 on
Video Anomaly Detection
on UCSD Ped2
1 code implementation • 10 May 2024 • Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, Yinzhi Cao
As a result, a natural attack, called prompt leaking, is to steal the system prompt from an LLM application, which compromises the developer's intellectual property.
no code implementations • 19 Mar 2024 • Cheng-Long Wang, Qi Li, Zihang Xiang, Yinzhi Cao, Di Wang
Our analysis, conducted across multiple unlearning benchmarks, reveals that these algorithms inconsistently fulfill their unlearning commitments due to two main issues: 1) unlearning new data can significantly affect the unlearning utility of previously requested data, and 2) approximate algorithms fail to ensure equitable unlearning utility across different groups.
1 code implementation • 10 Jan 2024 • Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao liu, Heng Ji, Hongyi Wang, huan zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.
1 code implementation • 20 May 2023 • Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, Yinzhi Cao
Text-to-image generative models such as Stable Diffusion and DALL$\cdot$E raise many ethical concerns due to the generation of harmful images such as Not-Safe-for-Work (NSFW) ones.
1 code implementation • 26 Oct 2022 • Haolin Yuan, Bo Hui, Yuchen Yang, Philippe Burlina, Neil Zhenqiang Gong, Yinzhi Cao
Federated learning (FL) allows multiple clients to collaboratively train a deep learning model.
no code implementations • 28 Feb 2022 • Haolin Yuan, Armin Hadzic, William Paul, Daniella Villegas de Flores, Philip Mathew, John Aucott, Yinzhi Cao, Philippe Burlina
Skin lesions can be an early indicator of a wide range of infectious and other diseases.
no code implementations • 4 Mar 2021 • William Paul, Yinzhi Cao, Miaomiao Zhang, Phil Burlina
Machine learning (ML) models used in medical imaging diagnostics can be vulnerable to a variety of privacy attacks, including membership inference attacks, that lead to violations of regulations governing the use of medical data and threaten to compromise their effective deployment in the clinic.
1 code implementation • 5 Jan 2021 • Bo Hui, Yuchen Yang, Haolin Yuan, Philippe Burlina, Neil Zhenqiang Gong, Yinzhi Cao
The success of the former heavily depends on the quality of the shadow model, i. e., the transferability between the shadow and the target; the latter, given only blackbox probing access to the target model, cannot make an effective inference of unknowns, compared with MI attacks using shadow models, due to the insufficient number of qualified samples labeled with ground truth membership information.
2 code implementations • ECCV 2020 • Chenglin Yang, Adam Kortylewski, Cihang Xie, Yinzhi Cao, Alan Yuille
PatchAttack induces misclassifications by superimposing small textured patches on the input image.
no code implementations • 5 Dec 2017 • Kexin Pei, Linjie Zhu, Yinzhi Cao, Junfeng Yang, Carl Vondrick, Suman Jana
Finally, we show that retraining using the safety violations detected by VeriVis can reduce the average number of violations up to 60. 2%.
3 code implementations • 18 May 2017 • Kexin Pei, Yinzhi Cao, Junfeng Yang, Suman Jana
First, we introduce neuron coverage for systematically measuring the parts of a DL system exercised by test inputs.