Exact $p$-values for global network alignments via combinatorial analysis of shared GO terms (Subtitle: REFANGO: Rigorous Evaluation of Functional Alignments of Networks using Gene Ontology)

9 Oct 2020 · Wayne B. Hayes

Network alignment aims to uncover topologically similar regions in the protein-protein interaction (PPI) networks of two or more species, under the assumption that topologically similar regions tend to perform similar functions. Although many network alignment algorithms and measures of topological similarity exist, there is currently no gold standard for evaluating how well either uncovers functionally similar regions. Here we propose a formal, mathematically and statistically rigorous method for evaluating the statistical significance of shared GO terms in a global, 1-to-1 alignment between two PPI networks. We use combinatorics to count exactly the number of possible network alignments in which $k$ aligned pairs of proteins share a particular GO term. Dividing this count by the number of all possible network alignments yields an explicit, exact $p$-value for a network alignment with respect to that GO term. As with BLAST's $p$-values and bit-scores, this method is designed not to guide the formation of any particular alignment, but to provide an after-the-fact evaluation of a fixed, given alignment.
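To make the counting idea concrete, here is a minimal Python sketch for a deliberately simplified setting. It is not taken from the paper, and the paper's full formula may differ. The assumptions are: a single GO term; every one of the `n1` proteins in the smaller network is aligned to a distinct protein among the `n2 >= n1` proteins of the larger network; `lam1` and `lam2` proteins carry the term in each network; and the $p$-value is taken as the fraction of all alignments with at least `k` aligned pairs sharing the term. All function and parameter names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, perm


def count_alignments_exact(n1, n2, lam1, lam2, k):
    """Count injective alignments G1 -> G2 in which exactly k aligned
    pairs have the GO term on both sides (simplified model, see lead-in)."""
    if not (0 <= lam1 <= n1 <= n2 and 0 <= lam2 <= n2):
        raise ValueError("need 0 <= lam1 <= n1 <= n2 and 0 <= lam2 <= n2")
    if k < 0 or k > min(lam1, lam2):
        return 0
    # Pick k annotated G1 proteins and map them injectively onto annotated G2 proteins.
    shared = comb(lam1, k) * perm(lam2, k)
    # The other annotated G1 proteins must land on unannotated G2 proteins
    # (perm returns 0 when there are too few targets).
    unshared = perm(n2 - lam2, lam1 - k)
    # Unannotated G1 proteins may take any of the G2 proteins still unused.
    rest = perm(n2 - lam1, n1 - lam1)
    return shared * unshared * rest


def p_value(n1, n2, lam1, lam2, k):
    """Exact fraction of all alignments with at least k sharing pairs."""
    total = perm(n2, n1)  # number of injective maps from G1 into G2
    hits = sum(count_alignments_exact(n1, n2, lam1, lam2, j)
               for j in range(k, min(lam1, lam2) + 1))
    return Fraction(hits, total)


def brute_force_count(n1, n2, lam1, lam2, k):
    """Enumerate every alignment of a tiny instance as a sanity check.
    Proteins 0..lam1-1 (G1) and 0..lam2-1 (G2) are the annotated ones."""
    count = 0
    for image in permutations(range(n2), n1):
        sharing = sum(1 for u in range(lam1) if image[u] < lam2)
        count += (sharing == k)
    return count
```

For example, with `n1=3`, `n2=4`, `lam1=2`, `lam2=2`, the closed-form counts for `k = 0, 1, 2` are 4, 16, and 4, which sum to the 24 possible alignments, and `p_value(3, 4, 2, 2, 2)` is `Fraction(1, 6)`. In this simplified model the distribution of `k` reduces to a hypergeometric distribution; a real evaluation would also have to handle details the sketch ignores, such as proteins with no annotations at all.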
