Secondary Use of Clinical Problem List Entries for Neural Network-Based Disease Code Assignment

27 Dec 2021 · Markus Kreuzthaler, Bastian Pfeifer, Diether Kramer, Stefan Schulz

Clinical information systems have become large repositories for semi-structured and partly annotated electronic health record data, which have reached a critical mass that makes them interesting for supervised, data-driven neural network approaches. We explored automated coding of 50-character-long clinical problem list entries using the International Classification of Diseases (ICD-10) and evaluated three different types of network architectures on the top 100 ICD-10 three-digit codes. A fastText baseline reached a macro-averaged F1-score of 0.83, followed by a character-level LSTM with a macro-averaged F1-score of 0.84. The top-performing approach used a RoBERTa model with a custom language model, fine-tuned for the downstream coding task, yielding a macro-averaged F1-score of 0.88. A neural network activation analysis, together with an investigation of the false positives and false negatives, revealed inconsistent manual coding as a main limiting factor.
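
The evaluation setup described in the abstract (restricting labels to the most frequent three-digit ICD-10 codes and reporting a macro-averaged F1-score) can be illustrated with a minimal sketch. This is not the authors' code: the function names `three_char_category`, `top_k_categories`, and `macro_f1` and the toy labels are hypothetical, and the truncation rule assumes codes are written with the three-character category first (for example `I10.90`).

```python
from collections import Counter


def three_char_category(icd10_code: str) -> str:
    """Truncate a full ICD-10 code such as 'I10.90' to its category 'I10'.

    Assumption: the first three characters identify the category.
    """
    return icd10_code.strip().upper()[:3]


def top_k_categories(codes, k):
    """Return the k most frequent three-character categories (hypothetical helper)."""
    counts = Counter(three_char_category(c) for c in codes)
    return [category for category, _ in counts.most_common(k)]


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1 scores.

    Every label counts equally, regardless of how often it occurs.
    """
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up gold and predicted categories
gold = ["I10", "I10", "E11", "J45"]
pred = ["I10", "E11", "E11", "J45"]
print(round(macro_f1(gold, pred, ["I10", "E11", "J45"]), 3))  # → 0.778
```

Because the macro average weights each code equally, a rarely used code that is coded inconsistently lowers the score as much as a frequent one.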

Tasks

Language Modelling

Methods

Adam • Attention Dropout • BERT • Dense Connections • Dropout • fastText • GELU • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • LSTM • Multi-Head Attention • Residual Connection • RoBERTa • Scaled Dot-Product Attention • Sigmoid Activation • Softmax • Tanh Activation • Weight Decay • WordPiece
