WildReceipt is a collection of receipt photos. For each photo, it contains a list of OCR results, each with a bounding box, text, and class.
The dataset contains 1,765 photos, 25 classes, and 50,000 text boxes. The goal is to benchmark "key information extraction": extracting key information from documents. The data combines two modalities, text and visual features, which makes this an interesting problem.
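To make the per-photo annotation concrete, here is a minimal sketch of how one receipt's OCR list could be represented and grouped by class. Since the dataset is pending release, everything specific here is an assumption: the field names (`bbox`, `text`, `label`), the `(x_min, y_min, x_max, y_max)` box convention, the file name, and the class names are hypothetical and not taken from the dataset itself.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TextBox:
    """One OCR result: where the text is, what it says, and its class."""
    # Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
    bbox: Tuple[float, float, float, float]
    text: str
    # Hypothetical class name; the actual 25 classes are not listed here.
    label: str


@dataclass
class ReceiptAnnotation:
    """All OCR results for a single receipt photo."""
    image_path: str
    boxes: List[TextBox] = field(default_factory=list)


def group_text_by_class(receipt: ReceiptAnnotation) -> Dict[str, List[str]]:
    """Collect the text of every box under its class, preserving box order."""
    grouped = defaultdict(list)
    for box in receipt.boxes:
        grouped[box.label].append(box.text)
    return dict(grouped)


# Illustrative record with made-up values.
example = ReceiptAnnotation(
    image_path="receipt_0001.jpg",
    boxes=[
        TextBox(bbox=(40.0, 12.0, 310.0, 48.0), text="ACME MART", label="store_name"),
        TextBox(bbox=(40.0, 400.0, 120.0, 430.0), text="TOTAL", label="total_key"),
        TextBox(bbox=(250.0, 400.0, 310.0, 430.0), text="12.50", label="total_value"),
    ],
)

fields = group_text_by_class(example)
```

In this sketch, key information extraction amounts to predicting the `label` of each box from its `text` and its position or appearance in the image; `group_text_by_class` then turns the labeled boxes into field values.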
The dataset is pending release.