Search Results for author: Qianqiao Xu

Found 1 paper, 0 papers with code

Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game

no code implementations · 3 Apr 2024 · Qianqiao Xu, Zhiliang Tian, Hongyan Wu, Zhen Huang, Yiping Song, Feng Liu, Dongsheng Li

In this paper, we propose a multi-agent attacker-disguiser game approach to achieve a weak defense mechanism that allows the large model to both safely reply to the attacker and hide the defense intent.

Prompt Engineering
