Graph Attention Networks

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).

ICLR 2018
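
The abstract describes the core idea (each node attends over its neighborhood's features with implicitly learned weights) but not the layer itself. The following is a minimal single-head sketch in plain Python, assuming the formulation usually given for GAT: a shared linear map, a LeakyReLU-scored pair of transformed features, and a softmax restricted to each node's neighbors. The names `gat_layer`, `features`, `neighbors`, `W`, and `a` are illustrative and do not come from this page; the output nonlinearity and multi-head attention are omitted for brevity.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU; the slope of 0.2 is an assumed default."""
    return x if x > 0 else slope * x


def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer (illustrative sketch).

    features:  list of N input vectors, each of length F
    neighbors: neighbors[i] lists the nodes that node i attends over;
               every node needs at least one entry (e.g. a self-loop)
    W:         F_out x F weight matrix shared by all nodes
    a:         attention vector of length 2 * F_out
    Returns (outputs, attention), where attention[i][k] is the weight
    node i places on neighbors[i][k].
    """
    # Shared linear transform applied to every node's features.
    z = [[sum(w * x for w, x in zip(row, h)) for row in W] for h in features]

    outputs, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # Masking: scores are computed only for actual neighbors of i.
        scores = [
            leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the neighbors' transformed features.
        outputs.append([
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(W))
        ])
        attention.append(alphas)
    return outputs, attention
```

Because the scores depend only on a node's own neighbor list, the same trained `W` and `a` can be applied to a graph that was never seen during training, which is what makes the inductive setting mentioned in the abstract possible. With `a` set to all zeros the weights become uniform and the layer reduces to a plain neighborhood mean.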

Results from the Paper


 Ranked #1 on Node Classification on Pubmed (Validation metric)

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Node Classification | Chameleon (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 63.9 ± 0.46 | #19 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Chameleon (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 63.9 ± 0.46 | #18 |
| Graph Classification | CIFAR10 100k | GAT | Accuracy (%) | 65.48 | #12 |
| Node Classification | Citeseer | GAT | Accuracy | 72.5 ± 0.7% | #44 |
| | | | Training Split | fixed 20 per node | #3 |
| | | | Validation | YES | #1 |
| Node Classification | CiteSeer (0.5%) | GAT | Accuracy | 38.2% | #13 |
| Node Classification | CiteSeer (1%) | GAT | Accuracy | 46.5% | #14 |
| Node Classification | CiteSeer (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 67.20 ± 0.46 | #31 |
| Node Classification | CiteSeer with Public Split: fixed 20 nodes per class | GAT | Accuracy | 72.5 ± 0.7% | #26 |
| Document Classification | Cora | GAT | Accuracy | 83.0% | #3 |
| Node Classification | Cora | GAT | Accuracy | 83.0% ± 0.7% | #45 |
| | | | Training Split | fixed 20 per node | #3 |
| | | | Validation | YES | #1 |
| Node Classification | Cora (0.5%) | GAT | Accuracy | 41.4% | #13 |
| Node Classification | Cora (1%) | GAT | Accuracy | 48.6% | #14 |
| Node Classification | Cora (3%) | GAT | Accuracy | 56.8% | #15 |
| Node Classification | Cora (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 76.70 ± 0.42 | #30 |
| Node Classification | Cora with Public Split: fixed 20 nodes per class | GAT | Accuracy | 83.0 ± 0.7% | #21 |
| Node Classification | Cornell (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 76.00 ± 1.01 | #26 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Cornell (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 76.00 ± 1.01 | #26 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | GAT | 1:1 Accuracy | 61.09 ± 0.77 | #22 |
| Node Classification | Film (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 35.98 ± 0.23 | #26 |
| Node Classification | Flickr | GAT (Velickovic et al., 2018) | Accuracy | 0.359 | #8 |
| Node Classification | genius | GAT | Accuracy | 55.80 ± 0.87 | #25 |
| Skeleton Based Action Recognition | J-HMBD Early Action | GAT | 10% | 58.1 | #2 |
| Graph Regression | Lipophilicity | GAT | RMSE | 0.95 | #1 |
| Graph Property Prediction | ogbg-code2 | GAT | Test F1 score | 0.1569 ± 0.0010 | #15 |
| | | | Validation F1 score | 0.1442 ± 0.0017 | #14 |
| | | | Number of params | 11030210 | #13 |
| | | | Ext. data | No | #1 |
| Node Property Prediction | ogbn-arxiv | GAT+label+reuse+topo loss | Test Accuracy | 0.7399 ± 0.0012 | #29 |
| | | | Validation Accuracy | 0.7513 ± 0.0009 | #29 |
| | | | Number of params | 1441580 | #32 |
| | | | Ext. data | No | #1 |
| Node Property Prediction | ogbn-arxiv | GAT+label reuse+self KD | Test Accuracy | 0.7416 ± 0.0008 | #23 |
| | | | Validation Accuracy | 0.7514 ± 0.0004 | #28 |
| | | | Number of params | 1441580 | #32 |
| | | | Ext. data | No | #1 |
| Node Property Prediction | ogbn-products | GAT with NeighborSampling | Test Accuracy | 0.7945 ± 0.0059 | #46 |
| | | | Validation Accuracy | not reported | #59 |
| | | | Number of params | 751574 | #34 |
| | | | Ext. data | No | #1 |
| Node Property Prediction | ogbn-proteins | GAT + labels + node2vec | Test ROC-AUC | 0.8711 ± 0.0007 | #8 |
| | | | Validation ROC-AUC | 0.9217 ± 0.0011 | #9 |
| | | | Number of params | 6360470 | #8 |
| | | | Ext. data | No | #1 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Penn94 | GAT | 1:1 Accuracy | 81.53 ± 0.55 | #14 |
| Node Classification | Penn94 | GAT | Accuracy | 81.53 ± 0.55 | #14 |
| Node Classification | PPI | GAT | F1 | 97.3 | #16 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Pubmed | GAT | F1-Score | 59.89 ± 4.12 | #1 |
| | | | NMI | 55.80 ± 0.87 | #1 |
| Node Classification | Pubmed | GAT | Training Split | fixed 20 per node | #4 |
| | | | Validation | YES | #1 |
| | | | F1-Score | 79.0 | #1 |
| Node Classification | PubMed (0.03%) | GAT | Accuracy | 50.9% | #12 |
| Node Classification | PubMed (0.05%) | GAT | Accuracy | 50.4% | #13 |
| Node Classification | PubMed (0.1%) | GAT | Accuracy | 59.6% | #13 |
| Node Classification | PubMed (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 83.28 ± 0.12 | #35 |
| Node Classification | PubMed with Public Split: fixed 20 nodes per class | GAT | Accuracy | 79.0% | #22 |
| Node Classification | Squirrel (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 42.72 ± 0.33 | #23 |
| Node Classification | Texas (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 78.87 ± 0.86 | #32 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Texas (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 78.87 ± 0.86 | #29 |
| Node Classification | Wisconsin (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 71.01 ± 4.66 | #29 |
| Node Classification on Non-Homophilic (Heterophilic) Graphs | Wisconsin (60%/20%/20% random splits) | GAT | 1:1 Accuracy | 71.01 ± 4.66 | #26 |
| Graph Regression | ZINC 100k | GAT | MAE | 0.463 | #8 |

Results from Other Papers


| Task | Dataset | Model | Metric | Value | Rank |
|---|---|---|---|---|---|
| Node Classification | Brazil Air-Traffic | GAT (Velickovic et al., 2018) | Accuracy | 0.382 | #7 |
| Node Classification | Europe Air-Traffic | GAT (Velickovic et al., 2018) | Accuracy | 42.4 | #5 |
| Node Classification | USA Air-Traffic | GAT (Velickovic et al., 2018) | Accuracy | 58.5 | #4 |
| Node Classification | Wiki-Vote | GAT (Velickovic et al., 2018) | Accuracy | 59.4 | #2 |
| Node Classification | PATTERN 100k | GAT | Accuracy (%) | 75.824 | #8 |

Methods