Machine learning models for early sepsis recognition in the neonatal intensive care unit using readily available electronic health record data

PLOS ONE 2019 · Aaron J. Masino, Mary Catherine Harris, Daniel Forsyth, Svetlana Ostapenko, Lakshmi Srinivasan, Christopher P. Bonafide, Fran Balamuth, Melissa Schmatz, Robert W. Grundmeier

Background
Rapid antibiotic administration is known to improve sepsis outcomes; however, early diagnosis remains challenging due to complex presentation. Our objective was to develop a model using readily available electronic health record (EHR) data capable of recognizing infant sepsis at least 4 hours prior to clinical recognition.

Methods and findings
We performed a retrospective case-control study of infants hospitalized ≥48 hours in the Neonatal Intensive Care Unit (NICU) at the Children’s Hospital of Philadelphia between September 2014 and November 2017 who received at least one sepsis evaluation before 12 months of age. We considered two evaluation outcomes as cases: culture positive, defined as a positive blood culture for a known pathogen (110 evaluations); and clinically positive, defined as negative cultures but antibiotics administered for ≥120 hours (265 evaluations). Case data were taken from the 44-hour window ending 4 hours prior to evaluation. We randomly sampled 1,100 44-hour windows of control data from all times ≥10 days removed from any evaluation. Model inputs consisted of up to 36 features derived from routine EHR data. Using 10-fold nested cross-validation, 8 machine learning models were trained to classify inputs as sepsis positive or negative. When tasked with discriminating culture positive cases from controls, 6 models achieved a mean area under the receiver operating characteristic curve (AUC) between 0.80 and 0.82, with no significant differences between them. Including both culture positive and clinically positive cases, the same 6 models achieved an AUC between 0.85 and 0.87, again with no significant differences.

Conclusions
Machine learning models can identify infants with sepsis in the NICU hours prior to clinical recognition. Learning curves indicate model improvement may be achieved with additional training examples. Additional input features may also improve performance. Further research is warranted to assess potential performance improvements and clinical efficacy in a prospective trial.
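
The windowing described in the Methods can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' code: the function names `case_window` and `is_valid_control_window` are hypothetical, and the control check assumes one possible reading of "≥10 days removed from any evaluation", namely that the entire 44-hour window lies at least 10 days before or after every evaluation time.

```python
from datetime import datetime, timedelta

WINDOW_HOURS = 44      # length of each observation window (from the abstract)
LEAD_HOURS = 4         # gap between the window end and the sepsis evaluation
CONTROL_GAP_DAYS = 10  # controls must be at least this far from any evaluation


def case_window(evaluation_time):
    """Return (start, end) of the 44-hour window ending 4 hours before evaluation."""
    end = evaluation_time - timedelta(hours=LEAD_HOURS)
    start = end - timedelta(hours=WINDOW_HOURS)
    return start, end


def is_valid_control_window(start, evaluation_times):
    """Assumed rule: the whole window is >= 10 days away from every evaluation."""
    end = start + timedelta(hours=WINDOW_HOURS)
    gap = timedelta(days=CONTROL_GAP_DAYS)
    return all(end <= t - gap or start >= t + gap for t in evaluation_times)


if __name__ == "__main__":
    evaluation = datetime(2016, 3, 10, 12, 0)  # made-up timestamp for illustration
    print(case_window(evaluation))
    print(is_valid_control_window(datetime(2016, 3, 25), [evaluation]))
```

Under these assumptions, a case window starts 48 hours before the evaluation and stops 4 hours before it, so no data from the final 4 hours reaches the model.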
