The Retrieval Question-Answering (ReQA) benchmark tests a model's ability to efficiently retrieve relevant answers from a large corpus of documents.
10 PAPERS • NO BENCHMARKS YET
MQ2008 is a Learning to Rank dataset containing 800 queries with labelled documents.
27 PAPERS • NO BENCHMARKS YET
The MSLR-WEB10K dataset consists of 10,000 search queries and the documents returned for them. Each query-document pair is described by 136 feature values and a human-judged relevance label on a five-point scale from 0 (irrelevant) to 4 (perfectly relevant). It is a subset of the larger MSLR-WEB30K dataset.
35 PAPERS • NO BENCHMARKS YET
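The MSLR-WEB10K entries above are distributed in the LETOR/SVMlight-style text format: each line holds a relevance label, a `qid`, and `index:value` feature pairs. A minimal parsing sketch (the three-feature example line is hypothetical; real MSLR files carry all 136 features and may append a `#`-prefixed comment):

```python
def parse_letor_line(line: str):
    """Parse one LETOR-format line into (relevance, query_id, features)."""
    # Drop an optional trailing comment such as "#docid = ...".
    body = line.split("#", 1)[0].split()
    relevance = int(body[0])                   # graded label, e.g. 0-4
    query_id = int(body[1].split(":", 1)[1])   # token looks like "qid:10"
    features = {}
    for token in body[2:]:                     # remaining "index:value" pairs
        idx, val = token.split(":", 1)
        features[int(idx)] = float(val)
    return relevance, query_id, features

# Usage on a made-up three-feature line:
rel, qid, feats = parse_letor_line("2 qid:10 1:0.03 2:0.0 3:1.5")
# rel == 2, qid == 10, feats == {1: 0.03, 2: 0.0, 3: 1.5}
```

The same format is used by MQ2007, MQ2008, and the Yahoo! Learning to Rank Challenge data, differing only in the number of features per line.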
The Yahoo! Learning to Rank Challenge dataset consists of 709,877 documents encoded in 700 features and sampled from query logs of the Yahoo! search engine, spanning 29,921 queries.
24 PAPERS • NO BENCHMARKS YET
ART consists of over 20k commonsense narrative contexts and 200k explanations.
9 PAPERS • NO BENCHMARKS YET
The Flickr Cropping Dataset consists of high-quality cropping and pairwise ranking annotations used to evaluate automatic image cropping approaches.
5 PAPERS • NO BENCHMARKS YET
The MQ2007 dataset consists of queries, corresponding retrieved documents and labels provided by human experts. The possible relevance labels for each document are “relevant”, “partially relevant”, and “not relevant”.
30 PAPERS • NO BENCHMARKS YET
A publicly available dataset of naturally occurring factual claims for automatic claim verification. The claims are collected from 26 English-language fact-checking websites, paired with textual sources and rich metadata, and labelled for veracity by expert journalists.
18 PAPERS • NO BENCHMARKS YET
A dataset of English-French sentence pairs annotated with semantic divergence classes and token-level rationales.
3 PAPERS • NO BENCHMARKS YET