WildReceipt is a collection of receipt photos. For each photo, it provides a list of OCR annotations, each with a bounding box, text, and class.
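
To make the per-photo annotation structure concrete, here is a minimal Python sketch of one possible in-memory representation. The dataset is not yet released, so its actual file format is not known. The class names, field names, bounding-box convention, file name, and example values below are all hypothetical, chosen only to illustrate "a list of OCR entries with bounding box, text, and class".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OcrBox:
    # Hypothetical convention: (x_min, y_min, x_max, y_max) in pixels.
    bbox: Tuple[float, float, float, float]
    text: str   # transcribed text inside the box
    label: int  # class index (the description mentions 25 classes)


@dataclass
class ReceiptSample:
    image_path: str      # hypothetical path to the receipt photo
    boxes: List[OcrBox]  # all OCR annotations for that photo


def texts_for_class(sample: ReceiptSample, label: int) -> List[str]:
    """Return the text of every box in the sample tagged with the given class."""
    return [box.text for box in sample.boxes if box.label == label]


# Made-up example record, not real dataset content.
sample = ReceiptSample(
    image_path="receipt_0001.jpg",
    boxes=[
        OcrBox((10, 20, 110, 40), "TOTAL", 0),
        OcrBox((120, 20, 180, 40), "12.50", 1),
    ],
)
```

With a structure like this, pulling out the text for one class is a simple filter, e.g. `texts_for_class(sample, 1)`.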

It contains 1765 photos, 25 classes, and 50000 text boxes. The goal is to benchmark key information extraction, that is, extracting key fields from documents. Each sample offers two modalities, text and visual features, and combining them is part of what makes the problem interesting. A potential use is automated information extraction from documents such as receipts.

The dataset is pending release.


