Open Macromolecular Genome: Generative Design of Synthetically Accessible Polymers

ACS Polym Au. 2023 Mar 29;3(4):318-330. doi: 10.1021/acspolymersau.3c00003. eCollection 2023 Aug 9.

Abstract

A grand challenge in polymer science lies in the predictive design of new polymeric materials with targeted functionality. However, de novo design of functional polymers is difficult because the chemical space is vast and structure-property relations are incompletely understood. Recent advances in deep generative modeling have facilitated the efficient exploration of molecular design space, but data sparsity in polymer science remains a major obstacle to progress. In this work, we introduce a vast polymer database known as the Open Macromolecular Genome (OMG), which contains synthesizable polymer chemistries compatible with known polymerization reactions and commercially available reactants selected for synthetic feasibility. The OMG is used in concert with a synthetically aware generative model known as Molecule Chef to identify property-optimized constitutional repeating units, constituent reactants, and reaction pathways of polymers, thereby advancing polymer design into the realm of synthetic relevance. As a proof-of-principle demonstration, we show that polymers with targeted octanol-water solubilities are readily generated together with monomer reactant building blocks and associated polymerization reactions. Suggested reactants are further integrated with Reaxys polymerization data to provide hypothetical reaction conditions (e.g., temperature, catalysts, and solvents). Broadly, the OMG is a polymer design approach capable of enabling data-intensive generative models for synthetic polymer design. Overall, this work represents a significant advance, enabling the property-targeted design of synthetic polymers subject to practical synthetic constraints.
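
The idea of building a polymer database from reactants that are compatible with known polymerization reactions can be illustrated with a toy enumeration. The sketch below is not the OMG construction pipeline: the rule table `REACTION_RULES`, the catalog `REACTANTS`, the functional-group labels, and the function `enumerate_polymers` are all hypothetical names introduced here for illustration, and real reactant matching would operate on molecular structures rather than hand-assigned labels.

```python
from itertools import combinations

# Hypothetical rule table: a pair of functional groups -> polymer class.
# Illustrative only; not the set of reactions used to build the OMG.
REACTION_RULES = {
    frozenset({"diol", "diacid"}): "polyester",
    frozenset({"diamine", "diacid"}): "polyamide",
    frozenset({"diol", "diisocyanate"}): "polyurethane",
}

# Hypothetical reactant catalog: name -> hand-assigned functional-group label.
REACTANTS = {
    "ethylene glycol": "diol",
    "adipic acid": "diacid",
    "hexamethylenediamine": "diamine",
    "hexamethylene diisocyanate": "diisocyanate",
}


def enumerate_polymers(reactants, rules):
    """Return (reactant_a, reactant_b, polymer_class) for each compatible pair."""
    hits = []
    # Iterate over unordered pairs of distinct reactants in a stable order.
    for a, b in combinations(sorted(reactants), 2):
        key = frozenset({reactants[a], reactants[b]})
        if key in rules:
            hits.append((a, b, rules[key]))
    return hits


candidates = enumerate_polymers(REACTANTS, REACTION_RULES)
for a, b, polymer_class in candidates:
    print(f"{a} + {b} -> {polymer_class}")
```

Each returned tuple records both reactants and the reaction class that links them, mirroring in miniature how a candidate polymer can be stored together with its building blocks and polymerization route. Pairs with no matching rule are simply skipped.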