Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs

22 Feb 2021  ·  Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh Saxena, Zhao Song, Huacheng Yu ·

For a directed graph $G$ with $n$ vertices and a start vertex $u_{\sf start}$, we wish to (approximately) sample an $L$-step random walk over $G$ starting from $u_{\sf start}$ with minimum space using an algorithm that only makes few passes over the edges of the graph. This problem found many applications, for instance, in approximating the PageRank of a webpage. If only a single pass is allowed, the space complexity of this problem was shown to be $\tilde{\Theta}(n \cdot L)$. Prior to our work, a better space complexity was only known with $\tilde{O}(\sqrt{L})$ passes. We settle the space complexity of this random walk simulation problem for two-pass streaming algorithms, showing that it is $\tilde{\Theta}(n \cdot \sqrt{L})$, by giving almost matching upper and lower bounds. Our lower bound argument extends to every constant number of passes $p$, and shows that any $p$-pass algorithm for this problem uses $\tilde{\Omega}(n \cdot L^{1/p})$ space. In addition, we show a similar $\tilde{\Theta}(n \cdot \sqrt{L})$ bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an $L$-step random walk from every vertex in the graph.

PDF Abstract
No code implementations yet. Submit your code now

Categories


Data Structures and Algorithms Computational Complexity

Datasets


  Add Datasets introduced or used in this paper