DangerousQA

Introduced by Bhardwaj et al. in Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment

DangerousQA refers to a set of harmful questions used to evaluate the safety and behavior of large language models (LLMs) in generating responses. In the context of the RED-EVAL safety benchmark, DangerousQA consists of 200 harmful questions collected from various sources, such as those related to racism, stereotypes, sexism, legality, toxicity, and harm. These questions are used to test the ability of LLMs to handle sensitive and potentially harmful content and to assess their performance in generating appropriate responses to such prompts.

Homepage

Benchmarks

Add a new result Link an existing benchmark

No benchmarks yet. Start a new benchmark or link an existing one.

Papers

Paper	Code	Results	Date	Stars

Dataset Loaders

Add Remove

No data loaders found. You can submit your data loader here.

Tasks

Similar Datasets

RedEval

XWINO

HarmfulQA

Usage

License

Unknown

DangerousQA

Benchmarks

Add a new result Link an existing benchmark

Papers

Dataset Loaders

Add Remove

Tasks

Similar Datasets

RedEval

XWINO

HarmfulQA

Usage

License

Modalities

Languages

DangerousQA

Benchmarks Edit Add a new result Link an existing benchmark

Papers

Dataset Loaders Edit Add Remove

Tasks Edit

Similar Datasets

RedEval

XWINO

HarmfulQA

Usage

License Edit

Modalities Edit

Languages Edit

Benchmarks

Add a new result Link an existing benchmark

Dataset Loaders

Add Remove

Tasks

License

Modalities

Languages