A Precision Environment-Wide Association Study of Hypertension via Supervised Cadre Models

14 Aug 2018  ·  Alexander New, Kristin P. Bennett ·

We consider the problem in precision health of grouping people into subpopulations based on their degree of vulnerability to a risk factor. These subpopulations cannot be discovered with traditional clustering techniques because their quality is evaluated with a supervised metric: the ease of modeling a response variable over observations within them. Instead, we apply the supervised cadre model (SCM), which does use this metric. We extend the SCM formalism so that it may be applied to multivariate regression and binary classification problems. We also develop a way to use conditional entropy to assess the confidence in the process by which a subject is assigned their cadre. Using the SCM, we generalize the environment-wide association study (EWAS) workflow to be able to model heterogeneity in population risk. In our EWAS, we consider more than two hundred environmental exposure factors and find their association with diastolic blood pressure, systolic blood pressure, and hypertension. This requires adapting the SCM to be applicable to data generated by a complex survey design. After correcting for false positives, we found 25 exposure variables that had a significant association with at least one of our response variables. Eight of these were significant for a discovered subpopulation but not for the overall population. Some of these associations have been identified by previous researchers, while others appear to be novel. We examine several discovered subpopulations in detail, and we find that they are interpretable and that they suggest further research questions.

PDF Abstract


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here