Forecasting the 2017-2018 Yemen Cholera Outbreak with Machine Learning

The ongoing Yemen cholera outbreak has been deemed one of the worst cholera outbreaks in history, with over a million people impacted and thousands dead. Triggered by a civil war, the outbreak has been shaped by various political, environmental, and epidemiological factors and continues to worsen. While cholera has several effective treatments, the untimely and inefficient distribution of existing medicines has been the primary cause of cholera mortality. With the hope of facilitating resource allocation, various mathematical models have been created to track the Yemeni outbreak and identify at-risk administrative divisions, called governorates. Existing models are not powerful enough to accurately and consistently forecast cholera cases per governorate over multiple timeframes. To address the need for a complex, reliable model, we offer the Cholera Artificial Learning Model (CALM); a system of 4 extreme-gradient-boosting (XGBoost) machine learning models that forecast the number of new cholera cases a Yemeni governorate will experience from a time range of 2 weeks to 2 months. CALM provides a novel machine learning approach that makes use of rainfall data, past cholera cases and deaths data, civil war fatalities, and inter-governorate interactions represented across multiple time frames. Additionally, the use of machine learning, along with extensive feature engineering, allows CALM to easily learn complex non-linear relations apparent in an epidemiological phenomenon. CALM is able to forecast cholera incidence 2 weeks to 2 months in advance within a margin of just 5 cholera cases per 10,000 people in real-world simulation.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here