RVL-CDIP_N_MP (RVL-CDIP-N multi-page)

Introduced by Van Landeghem et al. in "Beyond Document Page Classification: Design, Datasets, and Challenges"

RVL-CDIP_N_MP keeps the original purpose of RVL-CDIP-N as a covariate shift test set, now for multi-page document classification. We retrieved the original full documents from DocumentCloud and web search.

It uses the same 16-class label taxonomy as RVL-CDIP and contains close to 1K documents in PDF format, averaging 10 pages per document.
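As a rough illustration of the taxonomy and the per-document statistics mentioned above, the sketch below lists the 16 RVL-CDIP class names and summarizes a small set of records. The `summarize` helper, the `(label, page_count)` record shape, and the toy records are hypothetical and not part of the dataset release. The label order follows the original RVL-CDIP release and is only assumed to carry over to the multi-page set.

```python
from collections import Counter
from statistics import mean

# The 16 RVL-CDIP classes, in the index order of the original RVL-CDIP release.
# Assumption: the multi-page set uses the same order; verify against the actual files.
RVL_CDIP_LABELS = [
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
]


def summarize(documents):
    """Return (document count, mean pages per document, documents per label).

    `documents` is a hypothetical list of (label, page_count) pairs.
    """
    for label, _ in documents:
        if label not in RVL_CDIP_LABELS:
            raise ValueError(f"unknown label: {label!r}")
    per_label = Counter(label for label, _ in documents)
    return len(documents), mean(pages for _, pages in documents), per_label


# Toy records for illustration only, not real dataset entries.
toy = [("invoice", 2), ("memo", 1), ("scientific report", 27), ("invoice", 10)]
count, avg_pages, per_label = summarize(toy)
```

Running the same summary over the real documents would require extracting each PDF's page count first, which this sketch leaves out.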

License


  • Apache 2.0
