Low Data Drug Discovery with One-shot Learning

10 Nov 2016 · Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, Vijay Pande

Recent advances in machine learning have made significant contributions to drug discovery. Deep neural networks in particular have been demonstrated to provide significant boosts in predictive power when inferring the properties and activities of small-molecule compounds. However, the applicability of these techniques has been limited by the requirement for large amounts of training data. In this work, we demonstrate how one-shot learning can be used to significantly lower the amount of data required to make meaningful predictions in drug discovery applications. We introduce a new architecture, the residual LSTM embedding, that, when combined with graph convolutional neural networks, significantly improves the ability to learn meaningful distance metrics over small molecules. We open source all models introduced in this work as part of DeepChem, an open-source framework for deep learning in drug discovery.
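To make the idea of predicting from a learned distance metric concrete, here is a minimal sketch of similarity-weighted one-shot classification. It is an illustration under stated assumptions, not the paper's method: the embeddings are hand-written placeholder vectors rather than graph convolutional features, the residual LSTM embedding is not reproduced, and the names `cosine_similarity` and `one_shot_predict` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def one_shot_predict(query, support_embeddings, support_labels):
    """Estimate P(active) for a query molecule from a small support set.

    Each support label (0 or 1) is weighted by a softmax over the cosine
    similarity between the query embedding and that support embedding.
    """
    sims = [cosine_similarity(query, s) for s in support_embeddings]
    max_sim = max(sims)  # subtract the maximum for numerical stability
    exps = [math.exp(s - max_sim) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * y for w, y in zip(weights, support_labels))


# Placeholder embeddings (assumption): one active and one inactive example.
support_embeddings = [[1.0, 0.0], [0.0, 1.0]]
support_labels = [1, 0]

# A query close to the active example receives a probability above 0.5.
prob_active = one_shot_predict([0.9, 0.1], support_embeddings, support_labels)
prob_inactive_like = one_shot_predict([0.1, 0.9], support_embeddings, support_labels)
```

In this sketch the quality of the prediction depends entirely on the embedding: the closer the embedding places molecules with the same activity, the more weight the correct support labels receive.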


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Molecular Property Prediction | MUV | IterRefLSTM | ROC-AUC | 67.00 | #3 |
| Molecular Property Prediction | SIDER | IterRefLSTM | ROC-AUC | 70.40 | #3 |
| Molecular Property Prediction | Tox21 | IterRefLSTM | ROC-AUC | 83.00 | #2 |
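For readers unfamiliar with the metric, the following is a minimal sketch of how a ROC-AUC score can be computed from binary labels and predicted scores, using the pairwise-ranking definition. The function name `roc_auc` and the toy data are hypothetical. The values reported above appear to be on a 0-100 scale, which is an assumption here; the sketch returns a fraction in [0, 1].

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption): two active and two inactive compounds.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 positive/negative pairs are ordered correctly
auc_percent = 100.0 * auc
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of actives above inactives.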
