Data Augmentation for Graph Neural Networks

11 Jun 2020  ·  Tong Zhao, Yozen Liu, Leonardo Neves, Oliver Woodford, Meng Jiang, Neil Shah

Data augmentation has been widely used to improve the generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs. This is largely due to the complex, non-Euclidean structure of graphs, which limits possible manipulation operations. Augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node classification. We discuss practical and theoretical motivations, considerations, and strategies for graph data augmentation. Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in a given graph structure. Our main contribution is the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAug improves performance across GNN architectures and datasets.
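The idea of promoting intra-class edges and demoting inter-class edges can be illustrated with a minimal sketch. It is not the GAug implementation: the function `augment_edges`, its parameters `n_add` and `n_remove`, and the hand-written `edge_prob` table (a stand-in for the output of a trained neural edge predictor) are all assumptions made for illustration. The sketch removes the existing edges with the lowest predicted probability and adds the missing edges with the highest.

```python
from itertools import combinations


def augment_edges(num_nodes, edges, edge_prob, n_add, n_remove):
    """Return a modified undirected edge set (illustrative only).

    edges:     set of (u, v) pairs with u < v
    edge_prob: dict mapping every (u, v) pair with u < v to a predicted
               edge probability (hypothetical edge predictor output)
    """
    all_pairs = list(combinations(range(num_nodes), 2))
    existing = [p for p in all_pairs if p in edges]
    missing = [p for p in all_pairs if p not in edges]

    # Demote: drop the existing edges the predictor scores lowest.
    to_remove = sorted(existing, key=lambda p: edge_prob[p])[:n_remove]
    # Promote: add the missing edges the predictor scores highest.
    to_add = sorted(missing, key=lambda p: edge_prob[p], reverse=True)[:n_add]

    return (set(edges) - set(to_remove)) | set(to_add)


# Toy graph: nodes 0-2 belong to one class, nodes 3-5 to another.
labels = [0, 0, 0, 1, 1, 1]
edges = {(0, 1), (1, 2), (2, 3), (3, 4)}  # (2, 3) is an inter-class edge

# Assumed predictor output: high for same-class pairs, low otherwise.
edge_prob = {
    (u, v): 0.9 if labels[u] == labels[v] else 0.1
    for u, v in combinations(range(len(labels)), 2)
}

new_edges = augment_edges(len(labels), edges, edge_prob, n_add=1, n_remove=1)
print(sorted(new_edges))
```

In this toy setting the only inter-class edge, (2, 3), is removed and one intra-class edge is added, so every remaining edge connects nodes of the same class. A real edge predictor would be noisy, so the numbers of added and removed edges would need tuning.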


Results from the Paper

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
--- | --- | --- | --- | --- | ---
Node Classification | BlogCatalog | GCN+GAugM | Accuracy | 77.6% | # 2
Node Classification | CiteSeer with Public Split: fixed 20 nodes per class | GCN+GAugO | Accuracy | 73.3 ± 1.1 | # 12
Node Classification | Cora with Public Split: fixed 20 nodes per class | GCN+GAugO | Accuracy | 83.6 ± 0.5 | # 13
Node Classification | Flickr | GCN+GAugM (Zhao et al., 2021) | Accuracy | 0.682 | # 1

