1 code implementation • 13 Jun 2024 • Delong Ran, Jinyuan Liu, Yichen Gong, Jingyi Zheng, Xinlei He, Tianshuo Cong, Anyu Wang
Jailbreak attacks aim to induce Large Language Models (LLMs) to generate harmful responses to forbidden instructions, posing a severe misuse threat.
1 code implementation • 8 Apr 2024 • Tianshuo Cong, Delong Ran, Zesen Liu, Xinlei He, Jinyuan Liu, Yichen Gong, Qi Li, Anyu Wang, Xiaoyun Wang
Model merging is a promising lightweight model empowerment technique that does not rely on expensive computing devices (e.g., GPUs) or require the collection of specific training data.
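For context, a minimal sketch of one common form of model merging, simple parameter-wise interpolation between two checkpoints of the same architecture, is shown below; it assumes PyTorch-style state_dicts, and the function name and weighting factor are illustrative, not taken from the paper.

```python
# Illustrative sketch only: merge two homologous checkpoints by
# linear interpolation of their parameters (alpha*A + (1-alpha)*B).
import torch

def average_merge(state_dict_a, state_dict_b, alpha=0.5):
    """Return a new state_dict interpolating two models parameter-wise."""
    merged = {}
    for name, param_a in state_dict_a.items():
        param_b = state_dict_b[name]          # assumes identical key sets/shapes
        merged[name] = alpha * param_a + (1 - alpha) * param_b
    return merged
```

Because the merge operates purely on stored weights, it needs no gradient computation or training data, which is why it is considered a lightweight, GPU-free technique.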
2 code implementations • 9 Nov 2023 • Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang, Sisi Duan, Xiaoyun Wang
Ensuring the safety of artificial intelligence-generated content (AIGC) is a longstanding topic in the artificial intelligence (AI) community, and the safety concerns associated with Large Language Models (LLMs) have been widely investigated.