A Fair Comparison of Graph Neural Networks for Graph Classification

ICLR 2020 · Federico Errica, Marco Podda, Davide Bacciu, Alessio Micheli

Experimental reproducibility and replicability are critical topics in machine learning. Authors have often raised concerns about their lack in scientific publications, with the aim of improving the quality of the field. Recently, the graph representation learning field has attracted the attention of a wide research community, which has resulted in a large stream of works. As such, several Graph Neural Network (GNN) models have been developed to effectively tackle graph classification. However, experimental procedures often lack rigor and are hardly reproducible. Motivated by this, we provide an overview of common practices that should be avoided in order to compare fairly with the state of the art. To counter this troubling trend, we ran more than 47,000 experiments in a controlled and uniform framework to re-evaluate five popular models across nine common benchmarks. Moreover, by comparing GNNs with structure-agnostic baselines, we provide convincing evidence that, on some datasets, structural information has not been exploited yet. We believe that this work can contribute to the development of the graph learning field by providing a much-needed grounding for rigorous evaluations of graph classification models.

Task                  Dataset          Model      Metric Name  Metric Value  Global Rank
Graph Classification  COLLAB           GraphSAGE  Accuracy     73.9%         #23
Graph Classification  D&D              DGCNN      Accuracy     76.6%         #30
Graph Classification  ENZYMES          GraphSAGE  Accuracy     58.2%         #25
Graph Classification  ENZYMES          GIN        Accuracy     59.6%         #21
Graph Classification  IMDb-B           GraphSAGE  Accuracy     68.8%         #37
Graph Classification  IMDb-M           GraphSAGE  Accuracy     47.6%         #31
Graph Classification  NCI1             GIN        Accuracy     80%           #30
Graph Classification  NCI1             DGCNN      Accuracy     76.4%         #35
Graph Classification  PROTEINS         GraphSAGE  Accuracy     73%           #77
Graph Classification  PROTEINS         DiffPool   Accuracy     73.7%         #71
Graph Classification  REDDIT-B         GraphSAGE  Accuracy     84.3%         #8
Graph Classification  REDDIT-MULTI-5k  GraphSAGE  Accuracy     50%           #1

Methods