BCDR: Betweenness Centrality-based Distance Resampling for Graph Shortest Distance Embedding

29 Sep 2021 · Haoyu Wang, Chun Yuan

With the rapid development of network analysis in areas such as biomedical structure prediction and social relationship analysis, Shortest Distance Queries (SDQs) in graphs are receiving increasing attention. Approximate SDQ algorithms with reduced complexity are vital to complex graph applications. Among the available approaches, embedding-based distance prediction has made a breakthrough in both efficiency and accuracy, owing to the strong performance of Graph Representation Learning (GRL). Embedding-based distance prediction usually uses truncated random walks followed by Pointwise Mutual Information (PMI)-based optimization to embed local structural features into a dense vector for each node, and then couples these embeddings with a downstream predictor that extracts the shortest distance between node pairs globally. This approach has two shortcomings. First, a random walk, as an unconstrained node sequence, explores only a limited range of distances and fails to reach nodes that are remote under the graph's shortest distance metric. Second, the PMI-based maximum likelihood optimization of node embeddings reflects an excessively versatile notion of local similarity, which harms the preservation of exact shortest distance relations in the mapping from the original graph space to the embedded vector space. To address these shortcomings, we propose a novel graph shortest distance embedding method called Betweenness Centrality-based Distance Resampling (BCDR). First, we prove from a statistical perspective that a Betweenness Centrality (BC)-based random walk covers a wider distance range, measured by the intrinsic metric of the graph domain, because it is aware of the path structure. Second, we perform Distance Resampling (DR) on the original walk paths before maximum likelihood optimization, in place of PMI-based optimization, and prove that this strategy preserves the distance relation with respect to any calibrated node by steering the optimization objective to reconstruct a global distance matrix. The proposed method has a strong theoretical foundation and performs much better than existing methods on SDQ problems when evaluated on a broad class of real-world graph datasets with large diameters. It should also outperform existing methods in other graph structure-related applications.
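
To make the idea of a BC-based random walk more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the transition rule, so the sketch assumes that the next node is drawn with probability proportional to its betweenness centrality plus a small smoothing term. The function names `betweenness_centrality` and `bc_biased_walk`, the parameter `eps`, and the toy path graph are all hypothetical. Betweenness centrality is computed with Brandes' algorithm for unweighted, undirected graphs.

```python
import random
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness centrality via Brandes' algorithm.

    `adj` maps each node to a list of neighbors (unweighted, undirected).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted once from each endpoint.
    return {v: c / 2.0 for v, c in bc.items()}


def bc_biased_walk(adj, bc, start, length, rng, eps=1e-3):
    """Truncated random walk of at most `length` nodes.

    Assumption (not from the paper): the next node is sampled from the
    current node's neighbors with weight bc[neighbor] + eps.
    """
    walk = [start]
    for _ in range(length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        weights = [bc[n] + eps for n in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    # Hypothetical toy graph: the path 0 - 1 - 2 - 3 - 4.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    bc = betweenness_centrality(adj)
    print(bc)
    print(bc_biased_walk(adj, bc, start=0, length=6, rng=random.Random(0)))
```

On the toy path graph the unnormalized centralities are 0, 3, 4, 3, 0 for nodes 0 through 4, so a walk under this assumed rule is pulled toward the middle of the path, where most shortest paths pass. The smoothing term `eps` keeps zero-centrality nodes such as the endpoints reachable.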
