RCV1 (Reuters Corpus Volume 1)

Introduced by David D. Lewis et al. in RCV1: A New Benchmark Collection for Text Categorization Research

The RCV1 dataset is a benchmark dataset for text categorization. It is a collection of newswire articles produced by Reuters in 1996-1997. It contains 804,414 manually labeled newswire documents, categorized with respect to three controlled vocabularies: industries, topics, and regions.

Source: Random Projections for Linear Support Vector Machines
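
Because each document is labeled against three separate vocabularies, a common way to prepare such data for a classifier is to build one multi-hot vector per vocabulary. The following is a minimal sketch of that idea and not an official loader. The `Document` class, the `multi_hot` helper, and the short code lists are assumptions made for illustration; the industry codes in particular are hypothetical placeholders, and the real RCV1 code lists are much larger.

```python
from dataclasses import dataclass, field

# Illustrative vocabularies only; these short lists are assumptions,
# not the actual RCV1 code lists.
VOCABULARIES = {
    "topics": ["CCAT", "ECAT", "GCAT", "MCAT"],
    "industries": ["IND_A", "IND_B"],  # hypothetical placeholder codes
    "regions": ["USA", "UK", "FRA"],
}


@dataclass
class Document:
    """A newswire document with labels grouped by vocabulary name."""
    doc_id: int
    text: str
    labels: dict = field(default_factory=dict)  # vocabulary name -> set of codes


def multi_hot(doc, vocabularies=VOCABULARIES):
    """Return one 0/1 vector per vocabulary, ordered like its code list."""
    return {
        name: [1 if code in doc.labels.get(name, set()) else 0 for code in codes]
        for name, codes in vocabularies.items()
    }


# A made-up document with one topic label and two region labels.
doc = Document(
    doc_id=1,
    text="Example newswire text.",
    labels={"topics": {"CCAT"}, "regions": {"USA", "UK"}},
)
vectors = multi_hot(doc)
```

Keeping the three vocabularies in separate vectors makes it possible to train or evaluate on industries, topics, or regions independently.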

