Efficient and near-optimal algorithms for sampling small connected subgraphs
We study the following problem: given an integer $k \ge 3$ and a simple graph $G$, sample a connected induced $k$-node subgraph of $G$ uniformly at random. This is a fundamental graph mining primitive with applications in social network analysis, bioinformatics, and more. Surprisingly, no efficient algorithm is known for uniform sampling; the only somewhat efficient algorithms available yield samples that are only approximately uniform, with running times that are unclear or suboptimal. In this work we provide: (i) a near-optimal mixing time bound for a well-known random walk technique, (ii) the first efficient algorithm for truly uniform graphlet sampling, and (iii) the first sublinear-time algorithm for $\epsilon$-uniform graphlet sampling.
PDF Abstract