MolecularRNN: Generating realistic molecular graphs with optimized properties

31 May 2019 · Mariya Popova, Mykhailo Shvets, Junier Oliva, Olexandr Isayev

Designing new molecules with a set of predefined properties is a core problem in modern drug discovery and development, and there is a growing need for de novo design methods that address it. We present MolecularRNN, a graph recurrent generative model for molecular structures. The model generates diverse, realistic molecular graphs after likelihood pretraining on a large database of molecules. We analyze our pretrained models on large-scale generated datasets of 1 million samples. The model is then tuned with a policy gradient algorithm, given a critic that estimates the reward for the property of interest. We show a significant distribution shift toward the desired range for lipophilicity, drug-likeness, and melting point, outperforming state-of-the-art work. With rejection sampling based on valency constraints, our model yields 100% validity. Moreover, we show that invalid molecules provide a rich signal to the model through a structure penalty in our reinforcement learning pipeline.
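The abstract mentions rejection sampling based on valency constraints but does not spell out the procedure. The following is a toy sketch of the general idea, not the authors' implementation: a graph is built atom by atom, and any proposed bond order that would exceed an atom's maximum valence is rejected and redrawn. The valence table, the uniform random proposal (a stand-in for a learned model's distribution), and every function name here are assumptions made for illustration.

```python
import random

# Assumed maximum valences for a few atom types (illustrative, not from the paper).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def used_valence(bonds, atom_idx):
    """Sum of the bond orders already attached to atom_idx."""
    return sum(order for (i, j), order in bonds.items() if atom_idx in (i, j))


def sample_bond_order(atoms, bonds, i, j, rng, max_tries=100):
    """Draw a bond order for the pair (i, j), redrawing while valency is violated.

    Order 0 means "no bond" and is always accepted.
    """
    for _ in range(max_tries):
        # Uniform proposal: a hypothetical stand-in for the generative model.
        order = rng.choice([0, 1, 2, 3])
        if order == 0:
            return 0
        fits_i = used_valence(bonds, i) + order <= MAX_VALENCE[atoms[i]]
        fits_j = used_valence(bonds, j) + order <= MAX_VALENCE[atoms[j]]
        if fits_i and fits_j:
            return order
        # Otherwise the proposal is rejected and a new one is drawn.
    return 0  # fall back to "no bond" if every proposal was rejected


def generate(n_atoms, seed=0):
    """Generate a toy molecular graph as (atom symbols, {(i, j): bond order})."""
    rng = random.Random(seed)
    atoms, bonds = [], {}
    for j in range(n_atoms):
        atoms.append(rng.choice(sorted(MAX_VALENCE)))
        # Decide the bonds between the new atom j and every earlier atom i.
        for i in range(j):
            order = sample_bond_order(atoms, bonds, i, j, rng)
            if order:
                bonds[(i, j)] = order
    return atoms, bonds


def is_valid(atoms, bonds):
    """True if no atom exceeds its assumed maximum valence."""
    return all(used_valence(bonds, k) <= MAX_VALENCE[a] for k, a in enumerate(atoms))
```

Because every accepted bond respects the valence limits of both endpoints, each generated graph passes `is_valid` by construction. This sketch checks valency only; it does not enforce connectivity or any other chemical constraint.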


Results from the Paper


 Ranked #1 on Molecular Graph Generation on ZINC (QED Top-3 metric)

Task                        Dataset  Model         Metric Name  Metric Value         Global Rank
Molecular Graph Generation  MOSES    MolecularRNN  Validity     1.0                  #1
Molecular Graph Generation  ZINC     MRNN          QED Top-3    0.844, 0.796, 0.736  #1
Molecular Graph Generation  ZINC     MRNN          PlogP Top-3  8.63, 6.08, 4.73     #1
