Bayesian Algorithm for Retrosynthesis

J Chem Inf Model. 2020 Oct 26;60(10):4474-4486. doi: 10.1021/acs.jcim.0c00320. Epub 2020 Oct 13.

Abstract

Identifying synthetic routes that end with a desired product is an inherently time-consuming process that depends largely on expert knowledge covering only a limited proportion of the entire reaction space. Emerging machine learning technologies are now reformulating the process of retrosynthetic planning. This study aimed to discover synthetic routes by working backward from a given target molecule to commercially available compounds. The problem reduces to a combinatorial optimization task whose solution space is subject to the combinatorial complexity of all possible pairs of purchasable reactants. We address this issue within the framework of Bayesian inference and computation. The workflow first trains a deep neural network to predict, with a high level of accuracy, the product of given reactants (the forward model), and then inverts the forward model into a backward model via Bayes' law of conditional probability. Using the backward model, a Monte Carlo search algorithm exhaustively explores a diverse set of highly probable reaction sequences ending with a given synthetic target. With a forward model prediction accuracy of approximately 87%, the Bayesian retrosynthesis algorithm successfully rediscovered 81.8% and 33.3% of known synthetic routes for one-step and two-step reactions, respectively, as measured by top-10 accuracy. Remarkably, the Monte Carlo algorithm, which was specifically designed to handle the presence of multiple diverse routes, often revealed a ranked list of hundreds of reaction routes to the same synthetic target. We also investigated the potential applicability of such diverse candidates based on expert knowledge of synthetic organic chemistry.
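The inversion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the catalogue, the target, and every probability in the toy forward model below are invented for illustration, a uniform prior over reactant pairs is assumed, and a plain Metropolis random walk stands in for whichever Monte Carlo scheme the paper actually uses. Under those assumptions, Bayes' law gives p(pair | target) proportional to p(target | pair), and the sampler's visit counts yield a ranked list of candidate reactant pairs.

```python
import itertools
import random
from collections import Counter

# Hypothetical catalogue of purchasable reactants (illustrative only).
CATALOGUE = ["A", "B", "C", "D"]

# Toy stand-in for a trained forward model: p(target | reactant pair).
# These numbers are invented for the sketch, not taken from the paper.
FORWARD_PROB = {
    frozenset(("A", "B")): 0.90,
    frozenset(("A", "C")): 0.30,
    frozenset(("C", "D")): 0.05,
}
DEFAULT_PROB = 0.01  # small likelihood assigned to every other pair


def forward_likelihood(pair):
    """Return p(target | pair) under the toy forward model."""
    return FORWARD_PROB.get(frozenset(pair), DEFAULT_PROB)


def exact_posterior():
    """Bayes' rule with a uniform prior: normalise the forward likelihoods."""
    pairs = list(itertools.combinations(CATALOGUE, 2))
    weights = [forward_likelihood(p) for p in pairs]
    total = sum(weights)
    return {p: w / total for p, w in zip(pairs, weights)}


def metropolis_search(n_steps=20000, seed=0):
    """Sample reactant pairs from the posterior with a Metropolis random walk."""
    rng = random.Random(seed)
    current = ("C", "D")  # arbitrary starting pair
    visits = Counter()
    for _ in range(n_steps):
        # Propose: keep one reactant, swap the other for a new catalogue entry.
        keep = rng.choice(current)
        options = [r for r in CATALOGUE if r not in current]
        proposal = tuple(sorted((keep, rng.choice(options))))
        # Symmetric proposal and uniform prior: accept with the likelihood ratio.
        ratio = forward_likelihood(proposal) / forward_likelihood(current)
        if rng.random() < min(1.0, ratio):
            current = proposal
        visits[current] += 1
    return visits


if __name__ == "__main__":
    posterior = exact_posterior()
    visits = metropolis_search()
    # Ranked candidates: sampled frequency next to the exact posterior.
    for pair, count in visits.most_common():
        print(pair, round(count / sum(visits.values()), 3), round(posterior[pair], 3))
```

With four reactants the posterior can be enumerated exactly, which is why the sketch includes `exact_posterior` as a check on the sampler. For a realistic catalogue the number of pairs makes enumeration impractical, which is the situation where a sampling-based search becomes useful.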

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Machine Learning
  • Monte Carlo Method
  • Neural Networks, Computer*