NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching

Embeddings have become a key paradigm to learn graph representations and facilitate downstream graph analysis tasks. Existing graph embedding techniques either sample a large number of node pairs from a graph to learn node embeddings via stochastic optimization, or factorize a high-order proximity/adjacency matrix of the graph via expensive matrix factorization. However, these techniques usually require significant computational resources for the learning process, which hinders their applications on large-scale graphs. Moreover, the cosine similarity preserved by these techniques shows suboptimal efficiency in downstream graph analysis tasks, compared to Hamming similarity, for example. To address these issues, we propose NodeSketch, a highly-efficient graph embedding technique preserving high-order node proximity via recursive sketching. Specifically, built on top of an efficient data-independent hashing/sketching technique, NodeSketch generates node embeddings in Hamming space. For an input graph, it starts by sketching the self-loop-augmented adjacency matrix of the graph to output low-order node embeddings, and then recursively generates k-order node embeddings based on the self-loop-augmented adjacency matrix and (k-1)-order node embeddings. Our extensive evaluation compares NodeSketch against a sizable collection of state-of-the-art techniques using five real-world graphs on two graph analysis tasks. The results show that NodeSketch achieves state-of-the-art performance compared to these techniques, while showing significant speedup of 9x-372x in the embedding learning process and 1.19x-1.68x speedup when performing downstream graph analysis tasks.
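
The recursion described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the hash function or how the (k-1)-order embeddings are combined with the adjacency matrix, so everything below is an assumption made for illustration: a min-based weighted sampling rule stands in for the "data-independent hashing/sketching technique", neighbors' sketches are merged as value histograms scaled by a hypothetical decay weight `alpha`, and the names `sketch_rows`, `node_sketch`, `hamming_similarity`, `rounds`, and `L` are hypothetical.

```python
import numpy as np


def sketch_rows(V, hash_table):
    """Sketch every row of a non-negative matrix V into L integers.

    hash_table has shape (L, n) with values in (0, 1); it is drawn once
    and does not depend on the data. For hash j, a row keeps the column
    index i that minimises -log(h_j(i)) / V[i] (assumed sampling rule).
    """
    with np.errstate(divide="ignore"):
        # shape (rows, L, n); zero-weight columns score +inf and never win
        scores = -np.log(hash_table)[None, :, :] / V[:, None, :]
    return scores.argmin(axis=2)


def node_sketch(A, rounds, L, alpha, seed=0):
    """Return an (n, L) integer embedding matrix for adjacency matrix A.

    rounds = 0 sketches the self-loop-augmented adjacency matrix directly
    (the low-order embeddings); each further round merges the previous
    sketches of a node's neighbors before sketching again.
    """
    n = A.shape[0]
    sla = A + np.eye(n)  # self-loop-augmented adjacency matrix
    rng = np.random.default_rng(seed)
    hash_table = rng.random((L, n))  # same hashes reused in every round
    S = sketch_rows(sla, hash_table)
    for _ in range(rounds):
        # counts[r, i] = how often column index i appears in node r's sketch
        counts = np.zeros((n, n))
        np.add.at(counts, (np.repeat(np.arange(n), L), S.ravel()), 1.0)
        # add the neighbors' sketch histograms, down-weighted by alpha / L
        V = sla + (alpha / L) * (A @ counts)
        S = sketch_rows(V, hash_table)
    return S


def hamming_similarity(a, b):
    """Fraction of positions at which two sketches agree."""
    return float(np.mean(a == b))
```

Under these assumptions, calling `node_sketch(A, rounds=2, L=16, alpha=0.5)` on a small adjacency matrix `A` yields one row of 16 integers per node, and `hamming_similarity` compares two rows with a single element-wise equality check. This is the kind of comparison the abstract contrasts with cosine similarity. The dense `(n, L, n)` score array is only practical for toy graphs; a real implementation would work on sparse rows.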
