Exploiting structured data for learning contagious diseases under incomplete testing

1 Jan 2021 · Maggie Makar, Lauren West, David Hooper, Eric Horvitz, Erica Shenoy, John Guttag

One way machine learning algorithms can help control the spread of an infectious disease is by building models that predict who is likely to become infected, whether or not they display symptoms, which makes those individuals good candidates for preemptive isolation. In this work we ask: can we build reliable infection prediction models when the observed data are collected under limited and biased testing that prioritizes symptomatic individuals? Our analysis suggests that under favorable conditions, incomplete testing might be sufficient to achieve relatively good out-of-sample prediction error. Favorable conditions occur when untested-infected individuals have sufficiently different characteristics from untested-healthy individuals, and when the infected individuals are "potent", meaning they infect a large majority of their neighbors. We develop an algorithm that predicts infections and show that it outperforms benchmarks on simulated data. We apply our model to data from a large hospital to predict Clostridioides difficile infections, a communicable disease characterized by asymptomatic (i.e., untested) carriers. Using a proxy in place of the unobserved untested-infected state, we show that our model outperforms benchmarks in predicting infections.
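
To make the data setting concrete, here is a minimal toy sketch of biased, incomplete testing. It is not the authors' algorithm or data: the `Person` class, the `observe` function, the "test only symptomatic people" policy, the perfectly accurate test, and the five-person population are all assumptions made for illustration. It shows only why treating untested individuals as healthy understates the infection rate when asymptomatic carriers exist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    infected: bool     # true (latent) infection state, hidden unless tested
    symptomatic: bool  # whether the person shows symptoms


def observe(person: Person) -> Optional[bool]:
    """Return the test result if the person is tested, else None.

    Assumed policy (hypothetical): only symptomatic people are tested,
    and the test is perfectly accurate.
    """
    return person.infected if person.symptomatic else None


# Hypothetical toy population; values are made up for illustration.
population = [
    Person(infected=True, symptomatic=True),    # tested, positive
    Person(infected=True, symptomatic=False),   # untested-infected carrier
    Person(infected=False, symptomatic=True),   # tested, negative
    Person(infected=False, symptomatic=False),  # untested-healthy
    Person(infected=True, symptomatic=False),   # untested-infected carrier
]

observed = [observe(p) for p in population]

# True infection rate, which is unknowable from the observed labels alone.
true_rate = sum(p.infected for p in population) / len(population)

# Naive estimate that counts every untested person as healthy.
naive_rate = sum(o is True for o in observed) / len(population)

print(observed)
print(true_rate, naive_rate)
```

In this toy population the two untested-infected carriers are invisible to the naive estimate, which is the gap the abstract's "untested-infected" state refers to.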
