MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

28 Nov 2016 · Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, Tong Wang

We introduce a large-scale MAchine Reading COmprehension dataset, which we name MS MARCO. The dataset comprises 1,010,916 anonymized questions, sampled from Bing's search query logs, each with a human-generated answer, and 182,669 completely human-rewritten generated answers...

