Learning Graph-Level Representation for Drug Discovery

12 Sep 2017  Â·  Junying Li, Deng Cai, Xiaofei He ·

Predicating macroscopic influences of drugs on human body, like efficacy and toxicity, is a central problem of small-molecule based drug discovery. Molecules can be represented as an undirected graph, and we can utilize graph convolution networks to predication molecular properties. However, graph convolutional networks and other graph neural networks all focus on learning node-level representation rather than graph-level representation. Previous works simply sum all feature vectors for all nodes in the graph to obtain the graph feature vector for drug predication. In this paper, we introduce a dummy super node that is connected with all nodes in the graph by a directed edge as the representation of the graph and modify the graph operation to help the dummy super node learn graph-level feature. Thus, we can handle graph-level classification and regression in the same way as node-level classification and regression. In addition, we apply focal loss to address class imbalance in drug datasets. The experiments on MoleculeNet show that our method can effectively improve the performance of molecular properties predication.

PDF Abstract

Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Drug Discovery HIV dataset GraphConv + dummy super node + focal loss AUC 0.851 # 1
Drug Discovery MUV GraphConv + dummy super node AUC 0.845 # 2
Drug Discovery PCBA GraphConv + dummy super node AUC 0.867 # 1
Drug Discovery Tox21 GraphConv + dummy super node AUC 0.854 # 6
Drug Discovery ToxCast GraphConv + dummy super node AUC 0.768 # 2

Methods