RCV1 (Reuters Corpus Volume 1)

Introduced by David D. Lewis et al. in RCV1: A New Benchmark Collection for Text Categorization Research

RCV1 is a benchmark dataset for text categorization. It is a collection of newswire articles produced by Reuters in 1996-1997. It contains 804,414 manually labeled newswire documents, categorized with respect to three controlled vocabularies: industries, topics, and regions.

Source: Random Projections for Linear Support Vector Machines
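
Because each document is categorized under three separate vocabularies, it is naturally a multi-label record. The following is a minimal sketch of that idea, not the corpus's actual file format: the Rcv1Document class, the to_indicator_rows helper, the document IDs, and the code assignments are all hypothetical, and the code strings are placeholders written in the style of category codes.

```python
from dataclasses import dataclass, field


@dataclass
class Rcv1Document:
    """Toy stand-in for one article; NOT the real RCV1 schema."""
    item_id: int
    topics: set = field(default_factory=set)
    industries: set = field(default_factory=set)
    regions: set = field(default_factory=set)

    def labels(self):
        """Return all codes, prefixed by vocabulary so the three never collide."""
        return (
            {f"topic:{c}" for c in self.topics}
            | {f"industry:{c}" for c in self.industries}
            | {f"region:{c}" for c in self.regions}
        )


def to_indicator_rows(docs):
    """Build a sorted label list and one 0/1 indicator row per document."""
    vocab = sorted(set().union(*(d.labels() for d in docs)))
    rows = [[1 if name in d.labels() else 0 for name in vocab] for d in docs]
    return vocab, rows


# Hypothetical documents: IDs and code assignments are invented for illustration.
docs = [
    Rcv1Document(1, topics={"CCAT", "C15"}, regions={"USA"}),
    Rcv1Document(2, topics={"GCAT"}, industries={"I34420"}, regions={"UK", "USA"}),
]
vocab, rows = to_indicator_rows(docs)

for name in vocab:
    print(name)
for doc, row in zip(docs, rows):
    print(doc.item_id, row)
```

Prefixing each code with its vocabulary keeps the three label spaces distinct, so a classifier can be trained on one vocabulary alone or on all of them by selecting columns of the indicator rows.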
