Finding Stable Groups of Cross-Correlated Features in Two Data Sets With Common Samples

Data sets in which measurements of different types are obtained from a common set of samples appear in many scientific applications. In the analysis of such data, an important problem is to identify groups of features from different data types that are strongly associated. Given two data types, a bimodule is a pair $(A,B)$ of feature sets from the two types such that the aggregate cross-correlation between the features in $A$ and those in $B$ is large. A bimodule $(A,B)$ is stable if $A$ coincides with the set of features that have significant aggregate correlation with the features in $B$, and vice versa. We develop an iterative, testing-based procedure called BSP to identify stable bimodules. BSP relies on approximate p-values derived from the permutation moments of sums of squared sample correlations between a single feature of one type and a group of features of the second type. We carry out a thorough simulation study to assess the performance of BSP, and present an extended application to the problem of expression quantitative trait loci (eQTL) analysis using recent data from the GTEx project. In addition, we apply BSP to climatology data to identify regions in North America where annual temperature variation affects precipitation.
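
To make the test statistic concrete, the following minimal sketch computes the sum of squared sample correlations between one feature and a group of features of the other type, and attaches a p-value to it. It is an illustration under stated assumptions, not the BSP implementation: the abstract describes approximate p-values derived from permutation moments, whereas this sketch substitutes a plain Monte Carlo permutation test. The function names `sum_sq_corr` and `perm_pvalue`, the parameter `n_perm`, and the use of NumPy are all hypothetical choices made for the example.

```python
import numpy as np


def sum_sq_corr(x, Y):
    """Sum of squared sample correlations between x (n,) and each column of Y (n, q)."""
    xc = x - x.mean()                # center the single feature
    Yc = Y - Y.mean(axis=0)          # center each feature in the group
    num = xc @ Yc                    # cross-products, shape (q,)
    den = np.sqrt((xc @ xc) * (Yc * Yc).sum(axis=0))
    r = num / den                    # sample correlations, shape (q,)
    return float(np.sum(r ** 2))


def perm_pvalue(x, Y, n_perm=999, seed=None):
    """Monte Carlo permutation p-value for sum_sq_corr(x, Y).

    Assumption: sample labels of x are shuffled while Y is held fixed.
    This stands in for the moment-based approximation named in the abstract.
    """
    rng = np.random.default_rng(seed)
    observed = sum_sq_corr(x, Y)
    exceed = 0
    for _ in range(n_perm):
        if sum_sq_corr(rng.permutation(x), Y) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

In a stability check of the kind the abstract describes, one would compute such a p-value for every feature of the first type against the current set $B$, keep the features judged significant after a multiple-testing correction as the new $A$, and then repeat with the roles of the two types exchanged until the pair stops changing. The choice of correction is left open here because the abstract does not specify it.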
