Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks

8 Jul 2016 · Giona Casiraghi, Vahan Nanumyan, Ingo Scholtes, Frank Schweitzer

Statistical ensembles of networks, i.e., probability spaces of all networks that are consistent with given aggregate statistics, have become instrumental in the analysis of complex networks. Their numerical and analytical study provides the foundation for the inference of topological patterns, the definition of network-analytic measures, as well as for model selection and statistical hypothesis testing. Contributing to the foundation of these data analysis techniques, in this Letter we introduce generalized hypergeometric ensembles, a broad class of analytically tractable statistical ensembles of finite, directed and weighted networks. This framework can be interpreted as a generalization of the classical configuration model, which is commonly used to randomly generate networks with a given degree sequence or distribution. Our generalization rests on the introduction of dyadic link propensities, which capture the degree-corrected tendencies of pairs of nodes to form edges between each other. Studying empirical and synthetic data, we show that our approach provides broad perspectives for model selection and statistical hypothesis testing in data on complex networks.
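As a rough illustration of the configuration-model idea mentioned in the abstract, the sketch below draws a directed multigraph by sampling edges without replacement from an urn. The abstract does not spell out the construction, so everything here is an assumption made for illustration: the urn holds `out_degrees[i] * in_degrees[j]` balls for each ordered node pair, self-loops are allowed, and the function name `sample_hypergeometric_network` is hypothetical. It covers only the case of uniform link propensities; the dyadic propensities that define the generalized ensembles are not modeled.

```python
import random
from collections import Counter


def sample_hypergeometric_network(out_degrees, in_degrees, m, seed=None):
    """Sample m directed edges without replacement from an urn.

    Assumption (not taken from the abstract): the urn contains
    out_degrees[i] * in_degrees[j] balls for each ordered pair (i, j),
    self-loops included. Returns a Counter mapping (i, j) to the
    number of edges drawn for that pair.
    """
    rng = random.Random(seed)
    # Build the urn explicitly; fine for small illustrative networks.
    urn = [
        (i, j)
        for i, k_out in enumerate(out_degrees)
        for j, k_in in enumerate(in_degrees)
        for _ in range(k_out * k_in)
    ]
    if m > len(urn):
        raise ValueError("cannot draw more edges than balls in the urn")
    # random.sample draws without replacement, so the edge counts follow
    # a multivariate hypergeometric distribution over node pairs.
    return Counter(rng.sample(urn, m))


# Example: three nodes, three edges; node 2 has out-degree 0.
edges = sample_hypergeometric_network([2, 1, 0], [1, 1, 1], m=3, seed=1)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

In this sketch the degrees are matched only on average over many draws rather than in every sampled network, and the number of edges `m` is fixed exactly.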

Categories


Physics and Society · Social and Information Networks · Combinatorics · Data Analysis, Statistics and Probability · 05C82 (Primary), 62H15 (Secondary)