RetroPrime: A Diverse, plausible and Transformer-based method for Single-Step retrosynthesis predictions

Retrosynthesis prediction is a crucial task in organic synthesis. In this work, we propose RetroPrime, a single-step, template-free, Transformer-based method that integrates chemists’ retrosynthetic strategy of (1) decomposing a molecule into synthons and then (2) generating reactants by attaching leaving groups. Each stage is handled by its own Transformer model. On the USPTO-50K dataset, RetroPrime achieves a Top-1 accuracy of 64.8% when the reaction type is known and 51.4% when it is unknown. On the larger USPTO-full dataset, its Top-1 accuracy is close to that of the state-of-the-art Transformer-based method. Outputs of Transformer-based retrosynthesis models are known to suffer from insufficient diversity and high chemical implausibility. These problems may limit the usefulness of Transformer-based methods in practice, yet few works address both issues simultaneously. RetroPrime is designed to tackle both challenges.


Results from the Paper


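| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Single-step retrosynthesis | USPTO-50K | RetroPrime | Top-1 accuracy | 51.4 | #10 |
| Single-step retrosynthesis | USPTO-50K | RetroPrime | Top-3 accuracy | 70.8 | #4 |
| Single-step retrosynthesis | USPTO-50K | RetroPrime | Top-5 accuracy | 74.0 | #7 |
| Single-step retrosynthesis | USPTO-50K | RetroPrime | Top-10 accuracy | 76.1 | #7 |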
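
The Top-k accuracies reported above count a test example as correct when the ground-truth reactant set appears among the model's k highest-ranked candidates. The following is a minimal Python sketch of that metric, not the authors' evaluation code. It assumes that predictions and targets are SMILES strings already canonicalized in the same way, so that exact string comparison is meaningful; the function name `top_k_accuracy` and the toy data are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int,
) -> float:
    """Return the fraction of examples whose target is in the first k predictions.

    Assumption: every string is a canonical SMILES for a full reactant set,
    so exact string equality stands in for chemical identity.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(predictions, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: ranked candidate reactant sets for four products.
predictions = [
    ["CCO.CC(=O)O", "CCO.CC(=O)Cl"],   # target ranked first
    ["c1ccccc1Br", "c1ccccc1I"],       # target ranked second
    ["CC(=O)Cl.CN", "CC(=O)O.CN"],     # target ranked second
    ["CCBr.CO", "CCI.CO"],             # target not predicted
]
targets = ["CCO.CC(=O)O", "c1ccccc1I", "CC(=O)O.CN", "CCCl.CO"]

top1 = top_k_accuracy(predictions, targets, k=1)
top2 = top_k_accuracy(predictions, targets, k=2)
print(f"Top-1: {top1:.2f}, Top-2: {top2:.2f}")
```

Because the candidate lists are ranked, Top-k accuracy can only stay the same or increase as k grows, which matches the pattern in the table (51.4 at k=1 up to 76.1 at k=10).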
