WebBrain-Raw is a large-scale dataset built from English Wikipedia articles and the crawlable references they cite. It is distributed as 153 zipped data chunks; within each chunk, every line holds one Wikipedia page together with its reference articles.

Source: WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus
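
The line-per-page layout described above suggests a streaming read, so that no chunk has to be fully decompressed into memory. The following is a minimal sketch, not the dataset's official loader. It assumes each chunk is a standard .zip archive containing UTF-8 text files, and that each line may be a JSON object; neither detail is stated in the description. The names iter_chunk_lines and parse_record are hypothetical helpers introduced only for illustration.

```python
import io
import json
import zipfile
from typing import Iterator


def iter_chunk_lines(zip_path: str) -> Iterator[str]:
    """Yield every non-empty line from every file inside one zipped chunk.

    Assumption: the chunk is a standard .zip archive of UTF-8 text files.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.endswith("/"):
                continue  # skip directory entries
            with archive.open(member) as raw:
                # Wrap the binary stream so lines are decoded lazily.
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    line = line.strip()
                    if line:
                        yield line


def parse_record(line: str):
    """Try to decode one line as JSON.

    Assumption: a line might be a JSON object. If it is not valid JSON,
    the raw text is returned unchanged.
    """
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return line
```

Under these assumptions, one chunk could be processed with `for line in iter_chunk_lines(path): record = parse_record(line)`, where `path` points at one of the 153 chunks. If the chunks turn out to be gzip files rather than zip archives, the `gzip` module would be the analogous choice.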


