VisualMRC is a visual machine reading comprehension dataset built around the following task: given a question and a document image, a model must produce an abstractive answer.
More details, analyses, and baseline results are available in the paper "VisualMRC: Machine Reading Comprehension on Document Images" (AAAI 2021).
Statistics:
- 10,197 images
- 30,562 QA pairs
- 10.53 average question tokens (tokenized with the NLTK tokenizer)
- 9.53 average answer tokens (tokenized with the NLTK tokenizer)
- 151.46 average OCR tokens (tokenized with the NLTK tokenizer)
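The average token counts above are per-text means over tokenized questions, answers, and OCR text. The following is a minimal sketch of that computation, not the authors' script. The names `simple_tokenize` and `average_token_count` and the two sample questions are hypothetical, and a regex tokenizer stands in for the NLTK tokenizer so the example runs without extra dependencies; its counts will not match NLTK exactly.

```python
import re
from typing import Callable, Iterable, List


def simple_tokenize(text: str) -> List[str]:
    # Stand-in tokenizer (an assumption, not NLTK): splits into runs of
    # word characters and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def average_token_count(
    texts: Iterable[str],
    tokenize: Callable[[str], List[str]] = simple_tokenize,
) -> float:
    # Mean number of tokens per text; returns 0.0 for an empty collection.
    texts = list(texts)
    if not texts:
        return 0.0
    return sum(len(tokenize(t)) for t in texts) / len(texts)


# Hypothetical sample questions, not taken from the dataset.
questions = [
    "What is the title of the page?",
    "Who wrote this article?",
]
print(average_token_count(questions))
```

To get closer to the reported figures, a tokenizer from NLTK could be passed as the `tokenize` argument in place of the regex stand-in.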