CANARD is a dataset for question-in-context rewriting that consists of questions each given in a dialog context together with a context-independent rewriting of the question. The context of each question is the dialog utterences that precede the question. CANARD can be used to evaluate question rewriting models that handle important linguistic phenomena such as coreference and ellipsis resolution.
56 PAPERS • 1 BENCHMARK
The dataset contains 105,811 information-seeking conversations converted from MS MARCO. This dataset is constructed to relieve the data scarcity problem of conversational search to an extent. Considering the multi-intent problem and contextual information, this large-scale intent-oriented and context-aware dataset is automatically constructed based on the web search session data in MS MARCO. This dataset can be used to train and evaluate conversational search systems.
0 PAPER • NO BENCHMARKS YET