SODA is a high-quality social dialogue dataset. In contrast to most existing crowdsourced, small-scale dialogue corpora, Soda distills 1.5M socially-grounded dialogues from a pre-trained language model (InstructGPT; Ouyang et al., ). Dialogues are distilled by contextualizing social commonsense knowledge from a knowledge graph (Atomic10x).

Source: SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


Modalities


Languages