High-dimensional confounding adjustment using continuous spike and slab priors

25 Apr 2017  ·  Joseph Antonelli, Giovanni Parmigiani, Francesca Dominici

In observational studies, estimating the causal effect of a treatment on an outcome relies on proper adjustment for confounding. If the number of potential confounders ($p$) is larger than the number of observations ($n$), then direct control for all potential confounders is infeasible. Existing approaches for dimension reduction and penalization are generally aimed at predicting the outcome, and are less suited to estimating causal effects. Under standard penalization approaches (e.g. Lasso), if a variable $X_j$ is strongly associated with the treatment $T$ but only weakly with the outcome $Y$, the coefficient $\beta_j$ will be shrunk towards zero, leading to confounding bias. Under the assumptions of a linear model for the outcome and sparsity, we propose continuous spike and slab priors on the regression coefficients $\beta_j$ corresponding to the potential confounders $X_j$. Specifically, we introduce a prior distribution that does not heavily shrink towards zero the coefficients $\beta_j$ of those $X_j$ that are strongly associated with $T$ but weakly associated with $Y$. We compare our proposed approach to several state-of-the-art methods from the literature. Our proposed approach has the following features: 1) it reduces confounding bias in high-dimensional settings; 2) it shrinks the coefficients of instrumental variables towards zero; and 3) it achieves good coverage even with small sample sizes. We apply our approach to the National Health and Nutrition Examination Survey (NHANES) data to estimate the causal effects of persistent pesticide exposure on triglyceride levels.
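
The shrinkage problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or code: it is a hypothetical NumPy example in which a plain Lasso penalty (fitted by a hand-written coordinate descent, with the treatment coefficient left unpenalized) is compared with ordinary least squares. The data-generating values, the penalty `lam`, and all function and variable names are assumptions chosen for illustration, and the example uses $p < n$ so that least squares is available as a reference.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_unpenalized_treatment(T, X, y, lam, n_iter=100):
    """Minimize (1/2n)||y - T*tau - X@beta||^2 + lam*||beta||_1.

    The treatment effect tau is not penalized; only beta is.
    No intercept is fitted (the simulated variables have mean zero).
    """
    n, p = X.shape
    tau, beta = 0.0, np.zeros(p)
    resid = y.copy()  # always equals y - T*tau - X@beta
    for _ in range(n_iter):
        # Unpenalized least-squares update for the treatment effect.
        resid += T * tau
        tau = (T @ resid) / (T @ T)
        resid -= T * tau
        # Penalized updates for each potential confounder.
        for j in range(p):
            resid += X[:, j] * beta[j]
            rho = (X[:, j] @ resid) / n
            beta[j] = soft_threshold(rho, lam) / ((X[:, j] @ X[:, j]) / n)
            resid -= X[:, j] * beta[j]
    return tau, beta

# Hypothetical data: X_0 drives T strongly but affects Y only weakly.
rng = np.random.default_rng(0)
n, p, tau_true = 2000, 10, 1.0
X = rng.standard_normal((n, p))
T = 2.0 * X[:, 0] + rng.standard_normal(n)
y = tau_true * T + 0.3 * X[:, 0] + rng.standard_normal(n)

# Lasso with an assumed penalty level.
tau_lasso, beta_lasso = lasso_unpenalized_treatment(T, X, y, lam=0.2)

# Reference: ordinary least squares adjusting for every X_j.
coef_ols, *_ = np.linalg.lstsq(np.column_stack([T, X]), y, rcond=None)
tau_ols = coef_ols[0]

print("Lasso coefficient on X_0:", beta_lasso[0])
print("Lasso treatment effect:  ", tau_lasso)
print("OLS treatment effect:    ", tau_ols)
```

In this setup the weak outcome association of $X_0$ is absorbed by the treatment coefficient, so the penalty sets $\beta_0$ to zero and the treatment effect estimate moves away from the true value by roughly $0.3 \cdot \mathrm{Cov}(T, X_0)/\mathrm{Var}(T) = 0.12$, while the fully adjusted least-squares estimate stays close to it.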
