XL-Sum is a comprehensive and diverse dataset for abstractive summarization comprising 1 million professionally annotated article-summary pairs from the BBC, extracted using a set of carefully designed heuristics. The dataset covers 44 languages ranging from low- to high-resource, for many of which no public dataset is currently available. XL-Sum is highly abstractive, concise, and of high quality, as indicated by human and intrinsic evaluation.
47 PAPERS • NO BENCHMARKS YET
OpenAssistant Conversations. License: apache-2.0. Tags: human-feedback. Size category: 100K<n<1M.
15 PAPERS • NO BENCHMARKS YET
A corpus of 553k news articles from six Persian news websites and agencies, with relatively high-quality, author-extracted keyphrases that are then filtered and cleaned to further improve keyphrase quality.
3 PAPERS • NO BENCHMARKS YET
Pn-summary is a dataset for Persian abstractive text summarization.
2 PAPERS • NO BENCHMARKS YET