Combining Label Propagation and Simple Models Out-performs Graph Neural Networks

Graph Neural Networks (GNNs) are the predominant technique for learning over graphs. However, there is relatively little understanding of why GNNs are successful in practice and whether they are necessary for good performance. Here, we show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs by combining shallow models that ignore the graph structure with two simple post-processing steps that exploit correlation in the label structure: (i) an "error correlation" that spreads residual errors in training data to correct errors in test data and (ii) a "prediction correlation" that smooths the predictions on the test data. We call this overall procedure Correct and Smooth (C&S), and the post-processing steps are implemented via simple modifications to standard label propagation techniques from early graph-based semi-supervised learning methods. Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks, with just a small fraction of the parameters and orders of magnitude faster runtime. For instance, we exceed the best known GNN performance on the OGB-Products dataset with 137 times fewer parameters and greater than 100 times less training time. The performance of our methods highlights how directly incorporating label information into the learning algorithm (as was done in traditional techniques) yields easy and substantial performance gains. We can also incorporate our techniques into big GNN models, providing modest gains. Our code for the OGB results is at https://github.com/Chillee/CorrectAndSmooth.
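The two post-processing steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract). The function names, the dense adjacency matrix, the fixed `scale` factor, and the defaults for `alpha_correct`, `alpha_smooth`, and `num_iters` are all hypothetical choices made for this example. The base predictions are assumed to come from any model that ignores the graph.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} A D^{-1/2} for a dense, symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def propagate(S, init, alpha, num_iters):
    """Label-propagation style iteration: out <- (1 - alpha) * init + alpha * S @ out."""
    out = init.copy()
    for _ in range(num_iters):
        out = (1.0 - alpha) * init + alpha * (S @ out)
    return out


def correct_and_smooth(adj, base_probs, labels_onehot, train_idx,
                       alpha_correct=0.8, alpha_smooth=0.8,
                       scale=1.0, num_iters=50):
    """Sketch of the two steps; hyperparameter defaults are assumptions."""
    S = normalized_adjacency(adj)

    # Correct: residual errors are known only on training nodes; spread them
    # over the graph and add the (scaled) result to the base predictions.
    residual = np.zeros_like(base_probs, dtype=float)
    residual[train_idx] = labels_onehot[train_idx] - base_probs[train_idx]
    corrected = base_probs + scale * propagate(S, residual, alpha_correct, num_iters)

    # Smooth: reset training nodes to their true labels, then propagate the
    # corrected predictions so that neighboring nodes agree more closely.
    corrected[train_idx] = labels_onehot[train_idx]
    return propagate(S, corrected, alpha_smooth, num_iters)


# Toy graph (hypothetical): two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

labels_onehot = np.eye(2)[np.array([0, 0, 0, 1, 1, 1])]
base_probs = np.full((6, 2), 0.5)   # an uninformative base model
train_idx = np.array([0, 5])        # one labeled node per triangle

smoothed = correct_and_smooth(adj, base_probs, labels_onehot, train_idx)
print(smoothed.argmax(axis=1))
```

In this toy case the base model carries no information, so any class separation in the output comes entirely from the training labels spread along the edges. On large graphs a sparse matrix would replace the dense `adj`, and the scaling of the propagated residual would need tuning rather than a fixed value of 1.0.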

Published at ICLR 2021.
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | C&S(2hop) | 1:1 Accuracy | 64.52 ± 0.62 | # 20 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | C&S(1hop) | 1:1 Accuracy | 64.60 ± 0.57 | # 19 |
| Node Classification | genius | C&S 1-hop | Accuracy | 82.93 ± 0.15 | # 18 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | genius | C&S 2-hop | 1:1 Accuracy | 84.94 ± 0.49 | # 19 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | genius | C&S 1-hop | 1:1 Accuracy | 82.93 ± 0.15 | # 20 |
| Node Classification | genius | C&S 2-hop | Accuracy | 84.94 ± 0.49 | # 17 |
| Node Property Prediction | ogbn-arxiv | GAT + C&S | Test Accuracy | 0.7386 ± 0.0014 | # 34 |
| | | | Validation Accuracy | 0.7484 ± 0.0007 | # 38 |
| | | | Number of params | 1567000 | # 24 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-arxiv | Plain Linear + C&S | Test Accuracy | 0.7126 ± 0.0001 | # 72 |
| | | | Validation Accuracy | 0.7300 ± 0.0001 | # 66 |
| | | | Number of params | 5160 | # 75 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-arxiv | GAT(norm.adj.)+label reuse+C&S | Test Accuracy | 0.7395 ± 0.0012 | # 32 |
| | | | Validation Accuracy | 0.7519 ± 0.0008 | # 24 |
| | | | Number of params | 1441580 | # 32 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-arxiv | GCN_res + C&S_v2 | Test Accuracy | 0.7313 ± 0.0017 | # 44 |
| | | | Validation Accuracy | 0.7445 ± 0.0011 | # 46 |
| | | | Number of params | 155824 | # 58 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-arxiv | MLP + C&S | Test Accuracy | 0.7312 ± 0.0012 | # 45 |
| | | | Validation Accuracy | 0.7391 ± 0.0015 | # 51 |
| | | | Number of params | 175656 | # 57 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-arxiv | GCN_res + C&S | Test Accuracy | 0.7297 ± 0.0022 | # 49 |
| | | | Validation Accuracy | 0.7423 ± 0.0014 | # 49 |
| | | | Number of params | 155824 | # 58 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-arxiv | Linear + C&S | Test Accuracy | 0.7222 ± 0.0002 | # 58 |
| | | | Validation Accuracy | 0.7368 ± 0.0004 | # 56 |
| | | | Number of params | 15400 | # 74 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-products | Linear + C&S | Test Accuracy | 0.8301 ± 0.0001 | # 30 |
| | | | Validation Accuracy | 0.9134 ± 0.0001 | # 48 |
| | | | Number of params | 10763 | # 57 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-products | Spec-MLP-Wide + C&S | Test Accuracy | 0.8451 ± 0.0006 | # 21 |
| | | | Validation Accuracy | 0.9132 ± 0.0010 | # 49 |
| | | | Number of params | 406063 | # 39 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-products | GraphSAGE w/NS + C&S | Test Accuracy | 0.8041 ± 0.0022 | # 43 |
| | | | Validation Accuracy | 0.9238 ± 0.0007 | # 32 |
| | | | Number of params | 207919 | # 44 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-products | GAT w/NS + C&S | Test Accuracy | 0.8092 ± 0.0037 | # 40 |
| | | | Validation Accuracy | 0.9263 ± 0.0008 | # 28 |
| | | | Number of params | 753622 | # 33 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-products | Plain Linear + C&S | Test Accuracy | 0.8254 ± 0.0003 | # 32 |
| | | | Validation Accuracy | 0.9103 ± 0.0001 | # 50 |
| | | | Number of params | 4747 | # 58 |
| | | | Ext. data | No | # 1 |
| Node Property Prediction | ogbn-products | MLP + C&S | Test Accuracy | 0.8418 ± 0.0007 | # 24 |
| | | | Validation Accuracy | 0.9147 ± 0.0009 | # 47 |
| | | | Number of params | 96247 | # 55 |
| | | | Ext. data | No | # 1 |
| Node Classification | Penn94 | C&S 1-hop | Accuracy | 74.28 ± 1.19 | # 23 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Penn94 | C&S 1-hop | 1:1 Accuracy | 74.28 ± 1.19 | # 24 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Penn94 | C&S 2-hop | 1:1 Accuracy | 78.40 ± 3.12 | # 20 |
| Node Classification | Penn94 | C&S 2-hop | Accuracy | 78.40 ± 3.12 | # 19 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | twitch-gamers | C&S 2-hop | 1:1 Accuracy | 65.02 ± 0.16 | # 11 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | twitch-gamers | C&S 1-hop | 1:1 Accuracy | 64.86 ± 0.27 | # 12 |
