Joint Use of Node Attributes and Proximity for Semi-Supervised Classification on Graphs

22 Oct 2020 · Arpit Merchant, Michael Mathioudakis

The task of node classification is to infer unknown node labels, given the labels of some nodes along with the network structure and node attributes. Approaches to this task typically assume homophily, whereby neighboring nodes have similar attributes and a node's label can be predicted from the labels of its neighbors or other proximate (i.e., nearby) nodes in the network. However, this assumption does not always hold: in some cases, labels are better predicted from the individual attributes of each node than from the labels of its proximate nodes. Ideally, node classification methods should adapt flexibly to a range of settings in which unknown labels are predicted from the labels of proximate nodes, from individual node attributes, or partly from both. In this paper, we propose a principled approach, JANE, based on a generative probabilistic model that jointly weighs the roles of attributes and node proximity, via embeddings, in predicting labels. Our experiments on a variety of network datasets demonstrate that JANE exhibits the desired combination of versatility and competitive performance compared to standard baselines.
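
To make the contrast between the two prediction signals concrete, the following minimal sketch compares a proximity-based predictor (majority vote over labeled neighbors) with an attribute-based predictor (label of the labeled node with the closest attribute value) on a toy graph. It is not JANE and does not reproduce any part of the paper's model. The graph, the scalar attributes, the labels, and the function names `predict_from_neighbors` and `predict_from_attributes` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy graph as adjacency lists (not from the paper).
edges = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

# One made-up scalar attribute per node.
attrs = {0: 0.10, 1: 0.20, 2: 0.90, 3: 0.15, 4: 0.80, 5: 0.85}

# Labels are known for four nodes; nodes 2 and 3 are unlabeled.
known = {0: "a", 1: "a", 4: "b", 5: "b"}


def predict_from_neighbors(node):
    """Proximity signal: majority label among the node's labeled neighbors."""
    votes = Counter(known[n] for n in edges[node] if n in known)
    return votes.most_common(1)[0][0] if votes else None


def predict_from_attributes(node):
    """Attribute signal: label of the labeled node with the closest attribute."""
    nearest = min(known, key=lambda n: abs(attrs[n] - attrs[node]))
    return known[nearest]


for node in (2, 3):
    print(node, predict_from_neighbors(node), predict_from_attributes(node))
```

In this toy setup the two signals disagree on both unlabeled nodes, which is the situation the abstract describes: a method that relies on only one signal is wrong whenever the other one carries the label information.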
