no code implementations • 19 Jun 2025 • Biao Yi, Tiansheng Huang, Sishuo Chen, Tong Li, Zheli Liu, Zhixuan Chu, Yiming Li
It is motivated by an intriguing observation (dubbed the probe concatenate effect), where concatenated triggered samples significantly reduce the refusal rate of the backdoored LLM towards a malicious probe, while non-triggered samples have little effect.
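As a rough illustration, the sketch below shows how the probe concatenate effect could be turned into a detection test: prepend suspect samples to a fixed malicious probe and compare the refusal rate against a clean baseline. The helpers `generate` and `is_refusal` and the `gap` threshold are hypothetical placeholders, not the paper's implementation.

```python
# Hypothetical sketch of a probe-concatenation test; `generate` and
# `is_refusal` are assumed helpers, not the paper's actual code.

def refusal_rate(model, samples, probe, generate, is_refusal):
    """Fraction of concatenated inputs the model refuses to answer."""
    refusals = 0
    for s in samples:
        reply = generate(model, s + "\n" + probe)  # sample + malicious probe
        refusals += is_refusal(reply)
    return refusals / len(samples)

def flag_triggered(model, suspect, clean, probe, generate, is_refusal, gap=0.3):
    # Per the observation above: triggered samples sharply lower the
    # backdoored model's refusal rate toward the probe, clean ones barely do.
    r_suspect = refusal_rate(model, suspect, probe, generate, is_refusal)
    r_clean = refusal_rate(model, clean, probe, generate, is_refusal)
    return (r_clean - r_suspect) > gap  # a large drop suggests a trigger
```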
no code implementations • 22 May 2025 • Jianing Geng, Biao Yi, Zekun Fei, Tongxi Wu, Lihai Nie, Zheli Liu
Jailbreak attacks pose a serious threat to large language models (LLMs) by bypassing built-in safety mechanisms and leading to harmful outputs.
no code implementations • 22 May 2025 • Biao Yi, Tiansheng Huang, Baolei Zhang, Tong Li, Lihai Nie, Zheli Liu, Li Shen
Fine-tuning-as-a-service, while commercially successful for Large Language Model (LLM) providers, exposes models to harmful fine-tuning attacks.
no code implementations • 8 May 2025 • Biao Yi, Xavier Hu, Yurun Chen, Shengyu Zhang, Hongxia Yang, Fan Wu, Fei Wu
Cloud-based mobile agents powered by (multimodal) large language models ((M)LLMs) offer strong reasoning abilities but suffer from high latency and cost.
no code implementations • 30 Apr 2025 • Baolei Zhang, Haoran Xin, Minghong Fang, Zhuqing Liu, Biao Yi, Tong Li, Zheli Liu
This work pioneers the traceback of poisoned texts in retrieval-augmented generation (RAG) systems, providing a practical and promising defense mechanism for enhancing their security.
1 code implementation • 14 Nov 2024 • Zekun Fei, Biao Yi, Jianing Geng, Ruiqi He, Lihai Nie, Zheli Liu
Embedding-as-a-Service (EaaS) has emerged as a successful business model but faces significant copyright-infringement challenges, particularly API misuse and model extraction attacks.
no code implementations • 7 Nov 2024 • Fujie Zhang, Peiqi Yu, Biao Yi, Baolei Zhang, Tong Li, Zheli Liu
By using appropriate prompts to guide the truthfulness-related structure within the LLM's internal states, we make this structure more salient and more consistent across texts from different domains.
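A hedged sketch of the general idea, assuming the method probes pooled hidden states under a guiding prefix; the prompt text, the `hidden_state` helper, and the linear probe are illustrative placeholders, not the paper's implementation.

```python
# Illustrative prompt-guided truthfulness probing over LLM hidden states.
# `hidden_state(text) -> np.ndarray` is an assumed helper returning a pooled
# hidden vector from some intermediate layer of the LLM.
import numpy as np
from sklearn.linear_model import LogisticRegression

GUIDE = "Consider carefully whether the following statement is true:\n"

def features(texts, hidden_state):
    # Prefixing with GUIDE is meant to make the truthfulness-related
    # direction in the states more salient and more domain-consistent.
    return np.stack([hidden_state(GUIDE + t) for t in texts])

def train_probe(texts, labels, hidden_state):
    X = features(texts, hidden_state)
    return LogisticRegression(max_iter=1000).fit(X, labels)
```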
1 code implementation • 18 May 2024 • Biao Yi, Sishuo Chen, Yiming Li, Tong Li, Baolei Zhang, Zheli Liu
Backdoor attacks pose an increasingly severe security threat to Deep Neural Networks (DNNs) during their development stage.
no code implementations • 8 Mar 2022 • Tianyu Yang, Hanzhou Wu, Biao Yi, Guorui Feng, Xinpeng Zhang
In this paper, we propose a novel linguistic steganography (LS) method that modifies a given text by pivoting it between two different languages and embeds secret data by applying a GLS-like information encoding strategy.
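A toy sketch of the embedding step under stated assumptions: `translate_candidates` is a hypothetical helper returning an n-best list of round-trip (pivot) translations, and the simple bit-to-candidate index below stands in for the GLS-like coding.

```python
# Toy pivot-translation embedding: secret bits select among candidate
# back-translations of each sentence. `translate_candidates` is a
# hypothetical helper (e.g. wrapping an MT model's n-best output).

def embed(sentences, bits, translate_candidates, width=2):
    """Pick one of 2**width pivot translations per sentence to carry bits."""
    stego, i = [], 0
    for s in sentences:
        cands = translate_candidates(s, n=2 ** width)  # n-best round trips
        chunk = bits[i:i + width].ljust(width, "0")    # pad the final chunk
        idx = int(chunk, 2)                            # bits -> candidate index
        stego.append(cands[idx % len(cands)])          # guard short lists
        i += width
    return " ".join(stego)
```

For extraction, the receiver would have to reproduce the same candidate lists deterministically so the chosen index (and hence the bits) can be recovered, mirroring how generative LS decodes.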
no code implementations • 26 Jul 2021 • Biao Yi, Hanzhou Wu, Guorui Feng, Xinpeng Zhang
Such a difference can be naturally captured by the language model used to generate the stego texts.
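A minimal sketch of that intuition, assuming a helper `log_prob` that returns the generating LM's per-token log-probabilities; the thresholding rule is illustrative only, since the direction and cutoff would be calibrated on held-out natural and stego texts.

```python
# Illustrative LM-based steganalysis score; `log_prob` is an assumed helper.
import math

def perplexity(text, log_prob):
    lps = log_prob(text)  # per-token log-probabilities under the LM
    return math.exp(-sum(lps) / len(lps))

def looks_stego(text, log_prob, threshold):
    # Texts generated under bit-embedding constraints tend to occupy a
    # different likelihood region than natural text under the same LM.
    return perplexity(text, log_prob) > threshold
```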