A Label Attention Model for ICD Coding from Clinical Text

13 Jul 2020 · Thanh Vu, Dat Quoc Nguyen, Anthony Nguyen

ICD coding is the process of assigning International Classification of Diseases diagnosis codes to clinical/medical notes documented by health professionals (e.g. clinicians). This process requires significant human resources, and is thus costly and prone to error. To address the problem, machine learning has been used for automatic ICD coding. Previous state-of-the-art models were based on convolutional neural networks with one or several fixed window sizes. However, both the lengths of the text fragments related to ICD codes and the interdependence between them vary significantly across clinical text, which makes it difficult to decide what the best window sizes are. In this paper, we propose a new label attention model for automatic ICD coding, which can handle both the varying lengths and the interdependence of the ICD-code-related text fragments. Furthermore, because the majority of ICD codes are infrequently used, which makes the data extremely imbalanced, we additionally propose a hierarchical joint learning mechanism that extends our label attention model and exploits the hierarchical relationships among the codes to handle this issue. Our label attention model achieves new state-of-the-art results on three benchmark MIMIC datasets, and the joint learning mechanism helps improve performance on infrequent codes.
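The idea of attending over the whole note separately for each code, rather than through a fixed convolution window, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring function, so the plain dot-product scoring below is an assumption, and the names `label_attention`, `hidden` and `label_queries` are hypothetical. The token vectors are assumed to come from some text encoder that is out of scope here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(hidden, label_queries):
    """Generic label-wise attention (illustrative, assumed dot-product scoring).

    hidden:        list of n token vectors, each of length d
    label_queries: list of L label vectors, each of length d
                   (hypothetical learned parameters, one per code)

    Returns (weights, label_docs):
      weights[l]    is a probability distribution over the n tokens
      label_docs[l] is the attention-weighted sum of token vectors for label l
    """
    weights, label_docs = [], []
    for q in label_queries:
        # Score every token against this label's query vector.
        scores = [sum(qi * hi for qi, hi in zip(q, h)) for h in hidden]
        w = softmax(scores)
        # Label-specific document vector: weighted sum over all tokens,
        # so relevant fragments of any length or position can contribute.
        doc = [
            sum(w[t] * hidden[t][j] for t in range(len(hidden)))
            for j in range(len(q))
        ]
        weights.append(w)
        label_docs.append(doc)
    return weights, label_docs


# Toy example: 3 tokens, 2 labels, vector size 2.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
label_queries = [[5.0, 0.0], [0.0, 5.0]]
weights, label_docs = label_attention(hidden, label_queries)
```

In this toy example the first label puts most of its weight on tokens 0 and 2, and the second label on tokens 1 and 2, so each label builds its own view of the same note. A per-label classifier would then score each entry of `label_docs`.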


Datasets


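| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Medical Code Prediction | MIMIC-III | JointLAAT | Macro-AUC | 92.1 | #4 |
| Medical Code Prediction | MIMIC-III | JointLAAT | Micro-AUC | 98.8 | #5 |
| Medical Code Prediction | MIMIC-III | JointLAAT | Macro-F1 | 10.7 | #4 |
| Medical Code Prediction | MIMIC-III | JointLAAT | Micro-F1 | 57.5 | #6 |
| Medical Code Prediction | MIMIC-III | JointLAAT | Precision@5 | 80.6 | #3 |
| Medical Code Prediction | MIMIC-III | JointLAAT | Precision@8 | 73.5 | #7 |
| Medical Code Prediction | MIMIC-III | JointLAAT | Precision@15 | 59.0 | #7 |
| Medical Code Prediction | MIMIC-III | LAAT | Macro-AUC | 91.9 | #5 |
| Medical Code Prediction | MIMIC-III | LAAT | Micro-AUC | 98.8 | #5 |
| Medical Code Prediction | MIMIC-III | LAAT | Macro-F1 | 9.9 | #7 |
| Medical Code Prediction | MIMIC-III | LAAT | Micro-F1 | 57.5 | #6 |
| Medical Code Prediction | MIMIC-III | LAAT | Precision@5 | 81.3 | #2 |
| Medical Code Prediction | MIMIC-III | LAAT | Precision@8 | 73.8 | #6 |
| Medical Code Prediction | MIMIC-III | LAAT | Precision@15 | 59.1 | #6 |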
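
The results above report micro- and macro-averaged F1 as well as Precision@k, and the large gap between Micro-F1 (57.5) and Macro-F1 (about 10) reflects how many codes are rare. The sketch below shows one common way to compute these metrics for multi-label predictions. The function names are hypothetical, and macro-F1 is taken here as the mean of per-label F1, which is an assumption: evaluation scripts used for ICD coding may instead combine macro-averaged precision and recall, so this code is not guaranteed to reproduce the numbers in the table.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred, labels):
    """y_true, y_pred: lists of sets of codes, one set per document."""
    counts = {c: [0, 0, 0] for c in labels}  # per-label [tp, fp, fn]
    for gold, pred in zip(y_true, y_pred):
        for c in labels:
            if c in pred and c in gold:
                counts[c][0] += 1
            elif c in pred:
                counts[c][1] += 1
            elif c in gold:
                counts[c][2] += 1
    # Macro: every label counts equally, so rare codes weigh as much as common ones.
    macro = sum(f1(*counts[c]) for c in labels) / len(labels)
    # Micro: pool counts over all labels, so frequent codes dominate.
    tp = sum(v[0] for v in counts.values())
    fp = sum(v[1] for v in counts.values())
    fn = sum(v[2] for v in counts.values())
    micro = f1(tp, fp, fn)
    return micro, macro


def precision_at_k(gold, scores, k):
    """Fraction of the k highest-scoring codes that are in the gold set."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for c in top if c in gold) / k


# Toy example: code "A" is frequent and predicted well; "B" and "C" are rare.
labels = ["A", "B", "C"]
y_true = [{"A", "B"}, {"A"}]
y_pred = [{"A"}, {"A", "C"}]
micro, macro = micro_macro_f1(y_true, y_pred, labels)
p_at_2 = precision_at_k({"A", "B"}, {"A": 0.9, "C": 0.8, "B": 0.1}, 2)
```

On this toy data the micro-F1 is 2/3 while the macro-F1 is 1/3, because the two rare codes each score zero; `p_at_2` is 0.5 since only one of the two top-ranked codes is correct.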
