The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems

WS 2015 · Ryan Lowe, Nissan Pow, Iulian Serban, Joelle Pineau

This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data...

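TASK                               DATASET                        MODEL      METRIC  VALUE  GLOBAL RANK
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  Dual-LSTM  R10@1   0.604  #14
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  Dual-LSTM  R10@2   0.745  #13
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  Dual-LSTM  R10@5   0.926  #12
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  Dual-LSTM  R2@1    0.878  #11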
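
Response-selection results on this corpus are conventionally reported as recall at k among n candidates ("1 in n, R@k"): the model scores n candidate responses for a context, and an example counts as a hit if the true response lands in the top k. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `recall_at_k`, the `true_index` parameter, and the optimistic handling of tied scores are assumptions made for illustration.

```python
from typing import Sequence


def recall_at_k(
    scores: Sequence[Sequence[float]],
    k: int,
    true_index: int = 0,
) -> float:
    """Fraction of examples whose true response is ranked in the top k.

    scores[i] holds the model scores for the n candidates of example i.
    true_index is the position of the ground-truth response in each row
    (assumed here to be the same for every example).
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = 0
    for candidate_scores in scores:
        true_score = candidate_scores[true_index]
        # Count candidates that score strictly higher than the true response.
        # Ties are resolved in favor of the true response (an assumption).
        higher = sum(1 for s in candidate_scores if s > true_score)
        if higher < k:
            hits += 1
    return hits / len(scores)


if __name__ == "__main__":
    # Three toy examples with three candidates each; the true response is at index 0.
    toy_scores = [
        [0.9, 0.1, 0.3],  # true response ranked 1st
        [0.2, 0.8, 0.5],  # true response ranked 3rd
        [0.4, 0.6, 0.1],  # true response ranked 2nd
    ]
    for k in (1, 2, 3):
        print(f"1 in 3, R@{k} = {recall_at_k(toy_scores, k):.3f}")
```

With ten candidates per row this gives "1 in 10, R@k", and with two candidates and k = 1 it gives "1 in 2, R@1". A real evaluation would also need a stated policy for tied scores, since the optimistic rule used here can inflate recall when a model emits many identical scores.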