metapath2vec: Scalable Representation Learning for Heterogeneous Networks

We study the problem of representation learning in heterogeneous networks. Its unique challenges come from the existence of multiple types of nodes and links, which limit the feasibility of the conventional network embedding techniques. We develop two scalable representation learning models, namely metapath2vec and metapath2vec++. The metapath2vec model formalizes meta-path-based random walks to construct the heterogeneous neighborhood of a node and then leverages a heterogeneous skip-gram model to perform node embeddings. The metapath2vec++ model further enables the simultaneous modeling of structural and semantic correlations in heterogeneous networks. Extensive experiments show that metapath2vec and metapath2vec++ are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, such as node classification, clustering, and similarity search, but also discern the structural and semantic correlations between diverse network objects.
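
The meta-path-based random walk described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy academic graph, the node-type map, the function name `metapath_walk`, and the author-paper-author scheme are all assumptions made for this example. At each step the walker may move only to neighbors whose type matches the next type in the meta-path.

```python
import random


def metapath_walk(adj, node_type, start, metapath, walk_length, rng=None):
    """Generate one random walk whose node types follow `metapath` cyclically.

    adj:        dict mapping node -> list of neighbor nodes (hypothetical toy format)
    node_type:  dict mapping node -> type label
    metapath:   sequence of type labels with equal first and last label,
                e.g. ["A", "P", "A"] for author-paper-author
    """
    rng = rng or random.Random()
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")

    cycle = metapath[:-1]  # drop the repeated last type so the pattern can repeat
    walk = [start]
    while len(walk) < walk_length:
        next_type = cycle[len(walk) % len(cycle)]
        # Keep only neighbors of the type the meta-path asks for next.
        candidates = [n for n in adj[walk[-1]] if node_type[n] == next_type]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


# Toy graph (assumed for illustration): three authors, two papers, one venue.
node_type = {"a1": "A", "a2": "A", "a3": "A", "p1": "P", "p2": "P", "v1": "V"}
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "a3": ["p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "a3", "v1"],
    "v1": ["p1", "p2"],
}

walk = metapath_walk(adj, node_type, "a1", ["A", "P", "A"], walk_length=5,
                     rng=random.Random(0))
print(walk)
```

Walks produced this way would then serve as the "sentences" fed to a skip-gram model, with each node's heterogeneous neighborhood defined by its context window within the walk.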

Results from the Paper


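Task                      Dataset        Model         Metric Name          Metric Value     Global Rank
Link Prediction           MovieLens 25M  metapath2vec  nDCG@10              0.5051           #5
Link Prediction           MovieLens 25M  metapath2vec  Hits@10              0.7956           #5
Node Property Prediction  ogbn-mag       MetaPath2vec  Test Accuracy        0.3544 ± 0.0036  #34
Node Property Prediction  ogbn-mag       MetaPath2vec  Validation Accuracy  0.3506 ± 0.0017  #34
Node Property Prediction  ogbn-mag       MetaPath2vec  Number of params     94479069         #9
Node Property Prediction  ogbn-mag       MetaPath2vec  Ext. data            No               #1
Link Prediction           Yelp           Metapath2Vec  HR@10                0.6307           #7
Link Prediction           Yelp           Metapath2Vec  nDCG@10              0.402            #7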
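
For readers unfamiliar with the ranking metrics in the results above, the sketch below shows one common way to compute Hits@K (also reported as HR@K) and nDCG@K for a single query. The evaluation protocols behind the leaderboard numbers are not stated here, so the single-relevant-item setup and the function names are assumptions made for illustration.

```python
import math


def hits_at_k(ranked_items, positive, k=10):
    """Return 1.0 if the relevant item appears in the top k, else 0.0."""
    return 1.0 if positive in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, positive, k=10):
    """nDCG@k assuming exactly one relevant item per query.

    With a single relevant item the ideal DCG is 1, so nDCG reduces to
    1 / log2(rank + 1), where rank is the 1-based position of that item.
    """
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == positive:
            return 1.0 / math.log2(rank + 1)
    return 0.0


ranked = ["x", "y", "z"]  # hypothetical model ranking, best first
print(hits_at_k(ranked, "y", k=2), ndcg_at_k(ranked, "z"))
```

Reported scores are typically these per-query values averaged over all test queries.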
