RRS (Restoration-200k for Response Selection)

Introduced by Lan et al. in Exploring Dense Retrieval for Dialogue Response Selection
             Train   Validation   Test      Ranking Test
size         0.4M    50K          5K        800
pos:neg      1:1     1:9          1.2:8.8   -
avg turns    5.0     5.0          5.0       5.0

The ranking test set contains high-quality responses that were selected by several baselines; their correlation with the conversation context was carefully annotated by 8 professional annotators, and the average annotation scores are saved for ranking. Because these correlation scores are provided, the ranking test set should be evaluated with NDCG@3 and NDCG@5. More details are available in the Appendix of the paper.
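As a rough illustration of the metrics named above, the following Python sketch computes NDCG@k from the averaged annotation scores of a ranked candidate list. It assumes linear gain and a log2 rank discount, which is one common NDCG variant; the paper's exact formulation may differ. The function names and the sample scores are hypothetical and are not taken from the dataset.

```python
import math
from typing import Sequence


def dcg_at_k(scores: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    # rank is 0-based, so the discount for the first item is log2(2) = 1.
    return sum(s / math.log2(rank + 2) for rank, s in enumerate(scores[:k]))


def ndcg_at_k(scores_in_model_order: Sequence[float], k: int) -> float:
    """NDCG@k for one context.

    scores_in_model_order holds the annotated score of each candidate,
    listed in the order the model ranked the candidates (best first).
    """
    # The ideal ranking sorts the same candidates by annotated score.
    ideal_order = sorted(scores_in_model_order, reverse=True)
    ideal_dcg = dcg_at_k(ideal_order, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant candidate at all; define NDCG as 0
    return dcg_at_k(scores_in_model_order, k) / ideal_dcg


# Hypothetical averaged annotation scores, in the order a model ranked them.
example_scores = [2.5, 4.0, 1.0, 3.5, 0.5]
ndcg3 = ndcg_at_k(example_scores, 3)
ndcg5 = ndcg_at_k(example_scores, 5)
```

Averaging the per-context values over all contexts in the ranking test set would give the reported NDCG@3 and NDCG@5 under these assumptions.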

