Mimic-IV-ICD: A new benchmark for eXtreme MultiLabel Classification

Clinical notes are assigned ICD codes, that is, sets of codes for diagnoses and procedures. In recent years, predictive machine learning models have been built for automatic ICD coding. However, there is a lack of widely accepted benchmarks for automated ICD coding models based on large-scale public EHR data. This paper proposes a public benchmark suite for ICD-10 coding using a large EHR dataset derived from MIMIC-IV, the most recent public EHR dataset. We implement and compare several popular methods for ICD coding prediction in order to standardize data preprocessing and establish a comprehensive ICD coding benchmark dataset. This approach fosters reproducibility and model comparison, accelerating progress toward employing automated ICD coding in future studies. Furthermore, we create a new ICD-9 benchmark using MIMIC-IV data, providing more data points and a higher number of ICD codes than MIMIC-III. Our open-source code offers easy access to data processing steps, benchmark creation, and experiment replication for those with MIMIC-IV access, providing insights, guidance, and protocols to efficiently develop ICD coding models.

Results

Task: Medical Code Prediction. Each cell gives the metric value followed by its global rank in parentheses.

Dataset: MIMIC-IV-ICD-10-full
Model | Macro-AUC | Micro-AUC | Macro-F1 | Micro-F1 | Precision@8
LAAT | 92.96 (#3) | 99.14 (#3) | 4.47 (#4) | 55.40 (#4) | 66.97 (#5)
CAML | 89.91 (#1) | 98.79 (#5) | 4.07 (#5) | 52.67 (#5) | 64.43 (#4)
Joint LAAT | 93.64 (#4) | 99.27 (#2) | 5.71 (#1) | 55.89 (#3) | 66.89 (#3)
MSMN | 97.07 (#5) | 99.61 (#1) | 5.42 (#2) | 55.91 (#2) | 67.66 (#2)
PLM | 91.85 (#2) | 99.02 (#4) | 4.90 (#3) | 56.95 (#1) | 69.47 (#1)

Dataset: MIMIC-IV-ICD10-top50
Model | F1 (micro) | F1 (macro) | AUC (Micro) | AUC (Macro) | Precision@5
CAML | 67.56 (#5) | 64.30 (#5) | 93.18 (#5) | 91.05 (#5) | 59.58 (#5)
PLM-ICD | 73.27 (#2) | 70.31 (#1) | 95.69 (#1) | 93.37 (#3) | 64.57 (#2)
Joint LAAT | 72.85 (#3) | 68.41 (#3) | 95.57 (#3) | 93.39 (#2) | 64.49 (#3)
LAAT | 72.56 (#4) | 68.15 (#4) | 95.49 (#4) | 93.21 (#4) | 64.39 (#4)
MSMN | 74.15 (#1) | 69.01 (#2) | 95.61 (#2) | 93.60 (#1) | 65.16 (#1)

Dataset: MIMIC-IV-ICD9-full
Model | F1 Micro | F1 Macro | Macro AUC | Micro AUC | Precision@8
MSMN | 61.15 (#4) | 13.94 (#3) | 96.79 (#1) | 99.56 (#1) | 68.89 (#2)
PLM-ICD | 62.45 (#5) | 14.40 (#1) | 96.61 (#2) | 99.53 (#2) | 70.34 (#1)
Joint LAAT | 60.37 (#3) | 14.17 (#2) | 95.57 (#3) | 99.49 (#3) | 67.46 (#4)
LAAT | 60.31 (#2) | 13.12 (#4) | 95.18 (#4) | 99.47 (#4) | 67.47 (#3)
CAML | 57.28 (#1) | 11.06 (#5) | 93.45 (#5) | 99.29 (#5) | 64.91 (#5)

Dataset: MIMIC-IV-ICD9-top50
Model | F1 Micro | AUC Macro | AUC Micro | F1 Macro | Precision@5
PLM-ICD | 75.46 (#2) | 94.97 (#2) | 96.41 (#2) | 71.35 (#2) | 62.44 (#2)
MSMN | 75.78 (#1) | 95.13 (#1) | 96.46 (#1) | 71.85 (#1) | 62.60 (#1)
Joint LAAT | 74.33 (#4) | 94.92 (#3) | 96.31 (#3) | 69.93 (#4) | 61.95 (#4)
LAAT | 74.46 (#3) | 94.88 (#4) | 96.29 (#4) | 69.99 (#3) | 62.01 (#3)
CAML | 69.23 (#5) | 93.07 (#5) | 94.05 (#5) | 65.33 (#5) | 58.64 (#5)
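
The results above report micro- and macro-averaged F1 and Precision@k, which behave very differently when many codes are rare. The following is a minimal sketch of how such multi-label metrics are commonly computed, not the benchmark's own evaluation code. The function names and the toy data are hypothetical, and macro F1 is taken here as the mean of per-label F1 scores, which is an assumption: some ICD coding evaluations instead derive it from macro-averaged precision and recall. The sketch returns fractions in [0, 1], whereas the results above are percentages.

```python
from typing import List, Sequence, Tuple


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when the label never occurs or is never predicted."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float]:
    """Return (micro F1, macro F1) for binary indicator matrices of shape notes x labels."""
    n_labels = len(y_true[0])
    tps: List[int] = [0] * n_labels
    fps: List[int] = [0] * n_labels
    fns: List[int] = [0] * n_labels
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tps[j] += 1
            elif p:
                fps[j] += 1
            elif t:
                fns[j] += 1
    # Micro: pool counts over all labels, so frequent codes dominate.
    micro = _f1(sum(tps), sum(fps), sum(fns))
    # Macro (assumed convention): average per-label F1, so every code counts equally.
    macro = sum(_f1(tp, fp, fn) for tp, fp, fn in zip(tps, fps, fns)) / n_labels
    return micro, macro


def precision_at_k(
    y_true: Sequence[Sequence[int]], scores: Sequence[Sequence[float]], k: int
) -> float:
    """Mean fraction of the k highest-scored labels per note that are true labels."""
    total = 0.0
    for t_row, s_row in zip(y_true, scores):
        top = sorted(range(len(s_row)), key=lambda j: s_row[j], reverse=True)[:k]
        total += sum(t_row[j] for j in top) / k
    return total / len(y_true)


# Toy data (hypothetical): label 0 is frequent, label 2 is rare and always missed.
y_true = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [1, 1, 0], [1, 0, 0], [1, 0, 0]]
scores = [[0.9, 0.2, 0.1], [0.8, 0.7, 0.1], [0.9, 0.3, 0.2], [0.9, 0.4, 0.1]]

micro, macro = micro_macro_f1(y_true, y_pred)
p_at_2 = precision_at_k(y_true, scores, k=2)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}, P@2 = {p_at_2:.2f}")
```

On this toy data the frequent label keeps micro F1 high while the missed rare label pulls macro F1 down, the same kind of gap seen between Micro-F1 and Macro-F1 on the full-code datasets.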
