Deep generative design of porous organic cages via a variational autoencoder

Digit Discov. 2023 Oct 26;2(6):1925-1936. doi: 10.1039/d3dd00154g. eCollection 2023 Dec 4.

Abstract

Porous organic cages (POCs) are a class of porous molecular materials characterised by their tunable, intrinsic porosity; this functional property makes them candidates for applications including guest storage and separation. POCs are typically formed via dynamic covalent chemistry reactions from multifunctionalised molecular precursors. Because a cage can be formed by combining two relatively small organic molecules, each of which has an enormous chemical space of its own, the potential chemical space for POCs is enormous. However, identifying suitable molecular precursors for POC formation is challenging, as POCs often lack shape persistence (the cage collapses upon solvent removal with loss of its cavity), thus losing a key functional property (porosity). Generative machine learning models have potential for the targeted computational design of large functional molecular systems such as POCs. Here, we present a deep-learning-enabled generative model, Cage-VAE, for the targeted generation of shape-persistent POCs. We demonstrate the capacity of Cage-VAE to propose novel, shape-persistent POCs via integration with multiple efficient sampling methods, including Bayesian optimisation and spherical linear interpolation.
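
As an illustration of one of the sampling methods named above, spherical linear interpolation between two latent vectors can be sketched as follows. This is a minimal, generic sketch and not the Cage-VAE implementation: the function names `slerp` and `slerp_path`, the fallback tolerance `eps`, and the use of plain NumPy arrays as latent vectors are all assumptions made for illustration.

```python
import numpy as np


def slerp(z0, z1, t, eps=1e-7):
    """Spherical linear interpolation between vectors z0 and z1 at fraction t.

    Hypothetical helper: z0 and z1 stand in for two latent vectors of a VAE.
    """
    z0 = np.asarray(z0, dtype=float)
    z1 = np.asarray(z1, dtype=float)
    norm_product = np.linalg.norm(z0) * np.linalg.norm(z1)
    if norm_product < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return (1.0 - t) * z0 + t * z1
    # Angle between the two vectors, clipped to guard against rounding error.
    cos_omega = np.clip(np.dot(z0, z1) / norm_product, -1.0, 1.0)
    omega = np.arccos(cos_omega)
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel or antiparallel vectors: the formula is ill-conditioned.
        return (1.0 - t) * z0 + t * z1
    w0 = np.sin((1.0 - t) * omega) / sin_omega
    w1 = np.sin(t * omega) / sin_omega
    return w0 * z0 + w1 * z1


def slerp_path(z0, z1, n_points):
    """Return n_points evenly spaced interpolants from z0 to z1 (inclusive)."""
    return np.stack([slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, n_points)])
```

In a generative setting, each point along such a path would be passed to the decoder to obtain a candidate structure. Unlike straight-line interpolation, the path between two vectors of equal norm keeps that norm constant, which is the usual motivation for preferring it in latent spaces with a Gaussian prior.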