MedICaT is a dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references. Figures and captions are extracted from open access articles in PubMed Central and corresponding reference text is derived from S2ORC. The dataset consists of: 217,060 figures from 131,410 open access papers 7507 subcaption and subfigure annotations for 2069 compound figures Inline references for ~25K figures in the ROCO dataset

Source: https://github.com/allenai/medicat

Papers


Paper Code Results Date Stars

Dataset Loaders


Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages