MedQuAD includes 47,457 medical question-answer pairs created from 12 NIH websites (e.g. cancer.gov, niddk.nih.gov, GARD, MedlinePlus Health Topics). The collection covers 37 question types (e.g. Treatment, Diagnosis, Side Effects) associated with diseases, drugs and other medical entities such as tests.
23 PAPERS • NO BENCHMARKS YET
The MedNLI dataset consists of the sentence pairs developed by Physicians from the Past Medical History section of MIMIC-III clinical notes annotated for Definitely True, Maybe True and Definitely False. The dataset contains 11,232 training, 1,395 development and 1,422 test instances. This provides a natural language inference task (NLI) grounded in the medical history of patients.
8 PAPERS • 2 BENCHMARKS