Google1000

Google1000 is a collection of 1,000 public domain volumes scanned as part of the Google Book Search project. It is distributed to support research in a variety of disciplines. Each volume comes with its scanned images, OCR output, page tags, and basic metadata. The volumes are written in four languages: English, French, Italian, and Spanish. This document describes the organization of the dataset and the file formats.

License


  • Unknown
