Dataset Overview
vanilla.csv: Interactions without specific role-play instructions.
boss.csv: Interactions in which ChatGPT plays the role of the user's boss.
classmate.csv: Interactions in which ChatGPT acts as the user's classmate.
Each turn was coded for the motive behind the user's response or for the perceived naturalness of ChatGPT's response.
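Since the three files differ only in the role-play condition, it can be convenient to read them into one collection and tag each row with its source. The following is a minimal sketch using only the Python standard library. The function name `load_conditions`, the `data_dir` argument, and the added `condition` field are illustrative assumptions; the description above does not state the column names, so the code makes no assumption about them.

```python
import csv
from pathlib import Path

# File names come from the dataset description; the condition labels are our own.
CONDITION_FILES = {
    "vanilla": "vanilla.csv",
    "boss": "boss.csv",
    "classmate": "classmate.csv",
}


def load_conditions(data_dir):
    """Read the three CSV files and tag each row with its role-play condition."""
    rows = []
    for condition, filename in CONDITION_FILES.items():
        path = Path(data_dir) / filename
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Added field (hypothetical name), not part of the source files.
                row["condition"] = condition
                rows.append(row)
    return rows
```

Calling `load_conditions("path/to/data")` returns a list of dictionaries, one per coded turn, keyed by whatever header each file actually contains plus the added `condition` field. If a file already had a column with that name, this sketch would overwrite it.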
1 PAPER • NO BENCHMARKS YET
We construct a dataset named CPED from 40 Chinese TV shows. CPED consists of multisource knowledge related to empathy and personal characteristics. This knowledge covers 13 emotions, gender, Big Five personality traits, 19 dialogue acts, and other knowledge.
15 PAPERS • 3 BENCHMARKS
DuLeMon is a large-scale Chinese Long-term Memory Conversation dataset, which simulates long-term memory conversations and focuses on the ability to actively construct and utilize the user's and the bot's persona in a long-term interaction. DuLeMon contains about 27.5k human-human conversations, 449k utterances, and 12k persona grounding sentences. This corpus can be used to explore Long-term Memory Conversation, Personalized Dialogue, and Persona Extraction / Matching / Retrieval.
12 PAPERS • NO BENCHMARKS YET