Most existing dialogue systems fail to respond properly to potentially unsafe user utterances, either ignoring them or passively agreeing with them.
13 PAPERS • 1 BENCHMARK
The Innodata Red Teaming Prompts dataset aims to rigorously assess models’ factuality and safety. Because it was created manually and covers a broad range of scenarios, it enables a comprehensive examination of LLM performance.
1 PAPER • 1 BENCHMARK