Machine Learning and Data Science approach towards trend and predictors analysis of CDC Mortality Data for the USA

11 Sep 2020  ·  Yasir Nadeem, Awais Ahmed ·

The research on mortality is an active area of research for any country where the conclusions are driven from the provided data and conditions. The domain knowledge is an essential but not a mandatory skill (though some knowledge is still required) in order to derive conclusions based on data intuition using machine learning and data science practices. The purpose of conducting this project was to derive conclusions based on the statistics from the provided dataset and predict label(s) of the dataset using supervised or unsupervised learning algorithms. The study concluded (based on a sample) life expectancy regardless of gender, and their central tendencies; Marital status of the people also affected how frequent deaths were for each of them. The study also helped in finding out that due to more categorical and numerical data, anomaly detection or under-sampling could be a viable solution since there are possibilities of more class labels than the other(s). The study shows that machine learning predictions aren't as viable for the data as it might be apparent.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here