Finding Stable Groups of Cross-Correlated Features in Two Data Sets With Common Samples

10 Sep 2020 · Miheer Dewaskar, John Palowitch, Mark He, Michael I. Love, Andrew B. Nobel

Data sets in which measurements of different types are obtained from a common set of samples appear in many scientific applications. In the analysis of such data, an important problem is to identify groups of features from different data types that are strongly associated. Given two data types, a bimodule is a pair $(A,B)$ of feature sets from the two types such that the aggregate cross-correlation between the features in $A$ and those in $B$ is large. A bimodule $(A,B)$ is stable if $A$ coincides with the set of features that have significant aggregate correlation with the features in $B$, and vice versa. We develop an iterative, testing-based procedure called BSP to identify stable bimodules. BSP relies on approximate p-values derived from the permutation moments of sums of squared sample correlations between a single feature of one type and a group of features of the second type. We carry out a thorough simulation study to assess the performance of BSP, and present an extended application to the problem of expression quantitative trait loci (eQTL) analysis using recent data from the GTEx project. In addition, we apply BSP to climatology data to identify regions in North America where annual temperature variation affects precipitation.
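
To make the test statistic concrete, the following is a minimal Python sketch of the sum of squared sample correlations between a single feature and a group of features measured on the same samples. It is not the BSP implementation: the abstract describes approximate p-values derived from permutation moments, whereas this sketch swaps in a plain Monte Carlo permutation test for illustration. The function names `sum_sq_corr` and `permutation_pvalue`, the parameters `n_perm` and `seed`, and the simulated data are all assumptions introduced here, and the code assumes no feature is constant across samples.

```python
import numpy as np


def sum_sq_corr(x, Y):
    """Sum of squared sample correlations between x (n,) and each column of Y (n, q).

    Assumes x and every column of Y have nonzero variance.
    """
    xc = x - x.mean()
    Yc = Y - Y.mean(axis=0)
    r = (xc @ Yc) / (np.linalg.norm(xc) * np.linalg.norm(Yc, axis=0))
    return float(np.sum(r ** 2))


def permutation_pvalue(x, Y, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for the statistic above.

    Illustrative stand-in only: it permutes the sample labels of x directly
    rather than using a moment-based approximation.
    """
    rng = np.random.default_rng(seed)
    observed = sum_sq_corr(x, Y)
    exceed = sum(
        sum_sq_corr(rng.permutation(x), Y) >= observed for _ in range(n_perm)
    )
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, q = 100, 5
    Y = rng.normal(size=(n, q))                      # hypothetical feature group B
    x_assoc = Y[:, 0] + 0.1 * rng.normal(size=n)     # feature tied to one column of Y
    x_null = rng.normal(size=n)                      # feature unrelated to Y
    print("associated feature:", permutation_pvalue(x_assoc, Y))
    print("unrelated feature: ", permutation_pvalue(x_null, Y))
```

Under these assumptions, a feature that tracks one column of the group should receive a p-value near the minimum attainable value of 1 / (n_perm + 1), while an unrelated feature should receive a p-value roughly uniform on (0, 1]. A moment-based approximation, as described in the abstract, avoids the cost of drawing many permutations for every feature.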
