Empirical Bayes Shrinkage and False Discovery Rate Estimation, Allowing For Unwanted Variation

28 Sep 2017 · David Gerard, Matthew Stephens

We combine two important ideas in the analysis of large-scale genomics experiments (e.g., experiments that aim to identify genes that are differentially expressed between two conditions). The first is the use of Empirical Bayes (EB) methods to handle the large number of potentially sparse effects and to estimate false discovery rates and related quantities. The second is the use of factor analysis methods to deal with sources of unwanted variation such as batch effects and unmeasured confounders. We describe a simple modular fitting procedure that combines key ideas from both of these lines of research. This yields new, powerful EB methods for analyzing genomics experiments that account for both sparse effects and unwanted variation. In realistic simulations, these new methods provide significant gains in power and calibration over competing methods. In real data analysis, we find that different methods, while often conceptually similar, can vary widely in their assessments of statistical significance. This highlights the need for care in both the choice of methods and the interpretation of results. All methods introduced in this paper are implemented in the R package vicar, available at https://github.com/dcgerard/vicar.
