The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified and publicly-available collection of medical records. Each record in the dataset includes ICD-9 codes, which identify diagnoses and procedures performed. Each code is partitioned into sub-codes, which often include specific circumstantial details. The dataset consists of 112,000 clinical reports records (average length 709.3 tokens) and 1,159 top-level ICD-9 codes. Each report is assigned to 7.6 codes, on average. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more.
898 PAPERS • 8 BENCHMARKS
This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The clinical notes contain information about a patient at admission time to the ICU and are labelled for four outcome prediction tasks: Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay.
2 PAPERS • 4 BENCHMARKS