Dialogue Safety Prediction

Determine the safety of a given dialogue context.

Most implemented papers

ProsocialDialog: A Prosocial Backbone for Conversational Agents

skywalker023/prosocial-dialog 25 May 2022

With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating rules-of-thumb (RoTs) given conversational context, and a socially-informed dialogue agent, Prost.

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

babelscape/alert 6 Apr 2024

When building Large Language Models (LLMs), it is paramount to bear safety in mind and protect them with guardrails.

Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations

innodatalabs/innodata-llm-safety 15 Apr 2024

In this research, we used OpenAI GPT as a point of comparison since it excels at all levels of safety.