Learning Retrosynthetic Planning through Simulated Experience

John S Schreck; Connor W Coley; Kyle J M Bishop

doi:10.1021/acscentsci.9b00055

ACS Cent Sci. 2019 Jun 26;5(6):970-981. doi: 10.1021/acscentsci.9b00055. Epub 2019 May 31.

Authors

John S Schreck¹, Connor W Coley², Kyle J M Bishop¹

Affiliations

¹ Department of Chemical Engineering, Columbia University, New York, New York 10027, United States.
² Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Abstract

The problem of retrosynthetic planning can be framed as a one-player game, in which the chemist (or a computer program) works backward from a molecular target to simpler starting materials through a series of choices regarding which reactions to perform. This game is challenging because the combinatorial space of possible choices is astronomical, and the value of each choice remains uncertain until the synthesis plan is completed and its cost evaluated. Here, we address this search problem using deep reinforcement learning to identify policies that make (near) optimal reaction choices during each step of retrosynthetic planning according to a user-defined cost metric. Using simulated experience, we train a neural network to estimate the expected synthesis cost, or value, of any given molecule based on a representation of its molecular structure. We show that learned policies based on this value network can outperform a heuristic approach that favors symmetric disconnections when synthesizing unfamiliar molecules from available starting materials using the fewest reactions. We discuss how the learned policies described here can be incorporated into existing synthesis planning tools and how they can be adapted to changes in the synthesis cost objective or material availability.
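The idea of a policy that picks reactions using a value estimate can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in rather than the authors' implementation: the molecule labels, the `REACTIONS` table, the unit reaction cost, the dead-end penalty, and `toy_value`, which is a hard-coded lookup standing in for the trained value network. At each step the greedy policy chooses the disconnection that minimizes the reaction cost plus the summed value estimates of its precursors, then recurses until only available starting materials remain.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical toy data (not from the paper): purchasable starting materials
# and, for each molecule, the candidate disconnections into precursors.
BUYABLE: Set[str] = {"A", "B", "C"}
REACTIONS: Dict[str, List[Tuple[str, ...]]] = {
    "T": [("X", "A"), ("Y",)],
    "X": [("A", "B")],
    "Y": [("Z",)],
    "Z": [("W",)],
    "W": [("C",)],
}

REACTION_COST = 1.0    # assumption: every reaction step costs one unit
DEAD_END_COST = 100.0  # assumption: penalty for a molecule that cannot be made


def toy_value(molecule: str) -> float:
    """Stand-in for a learned value network: estimated synthesis cost."""
    estimates = {"A": 0.0, "B": 0.0, "C": 0.0, "X": 1.0,
                 "W": 1.0, "Z": 2.0, "Y": 3.0}
    return estimates.get(molecule, DEAD_END_COST)


def greedy_choice(molecule: str,
                  value: Callable[[str], float]) -> Optional[Tuple[str, ...]]:
    """Pick the disconnection with the lowest estimated total cost."""
    options = REACTIONS.get(molecule, [])
    if not options:
        return None
    return min(options,
               key=lambda precursors: REACTION_COST
               + sum(value(p) for p in precursors))


def plan_cost(target: str, value: Callable[[str], float]) -> float:
    """Follow the greedy policy to completion and return the realized cost."""
    if target in BUYABLE:
        return 0.0
    choice = greedy_choice(target, value)
    if choice is None:
        return DEAD_END_COST
    return REACTION_COST + sum(plan_cost(p, value) for p in choice)


if __name__ == "__main__":
    print(greedy_choice("T", toy_value))
    print(plan_cost("T", toy_value))
```

In this toy setting the policy prefers the two-step route through `X` over the four-step route through `Y`, because the value estimates make the shorter route look cheaper. Changing the cost constants or the set of buyable molecules changes which route is chosen, which is the sense in which such a policy depends on the cost objective and on material availability.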

Grants and funding

G20 RR030893/RR/NCRR NIH HHS/United States