DangerousQA refers to a set of harmful questions used to evaluate the safety and behavior of large language models (LLMs) in generating responses. In the context of the RED-EVAL safety benchmark, DangerousQA consists of 200 harmful questions collected from various sources, such as those related to racism, stereotypes, sexism, legality, toxicity, and harm. These questions are used to test the ability of LLMs to handle sensitive and potentially harmful content and to assess their performance in generating appropriate responses to such prompts.
Paper | Code | Results | Date | Stars |
---|