Boosting Graph Neural Networks with Molecular Mechanics: A Case Study of Sigma Profile Prediction

J Chem Theory Comput. 2023 Dec 26;19(24):9318-9328. doi: 10.1021/acs.jctc.3c01003. Epub 2023 Dec 8.

Abstract

Sigma profiles are quantum-chemistry-derived molecular descriptors that encode the polarity of molecules. They have performed well as features in machine learning applications. To accelerate the development of these models and the construction of large sigma profile databases, this work proposes a graph convolutional network (GCN) architecture to predict sigma profiles from molecular structures. To this end, molecular mechanics (force field atom types) is explored as a computationally inexpensive node-level featurization technique to encode the local and global chemical environments of atoms in molecules. The GCN models developed in this work accurately predict the sigma profiles of assorted organic and inorganic compounds. The best GCN model reported here, obtained using Merck molecular force field (MMFF) atom types, displayed training and testing set coefficients of determination of 0.98 and 0.96, respectively, which are superior to those of previous methodologies reported in the literature. This performance boost is shown to be due to both the use of a convolutional architecture and node-level features based on force field atom types. Finally, to demonstrate their practical applicability, we used GCN-predicted sigma profiles as the input to machine learning models previously developed in the literature that predict boiling temperatures and aqueous solubilities. Using the predicted sigma profiles as input, these models computed both physicochemical properties with significantly fewer computational resources and displayed only a slight decrease in performance compared with sigma profiles obtained from quantum chemistry methods.
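The featurization idea described above can be illustrated with a minimal sketch: one-hot force field atom types serve as node features, a single graph convolution mixes each atom's features with those of its bonded neighbors, and a sum over atoms yields a fixed-length molecular vector. This is not the architecture or the trained model from the paper. The atom-type labels for ethanol are hand-assigned for illustration (assumed to follow MMFF94 numbering), the weights are hypothetical fixed numbers rather than learned parameters, the three output bins are an arbitrary stand-in for a real sigma profile grid, and row-normalized (mean) aggregation is swapped in as one simple choice of graph convolution.

```python
# Ethanol (CH3-CH2-OH): atoms 0-1 are carbons, 2 is oxygen, 3-8 are hydrogens.
# Assumption: hand-assigned labels following MMFF94 numbering
# (1 = alkyl carbon, 5 = H on carbon, 6 = alcohol oxygen, 21 = hydroxyl H).
# A real pipeline would obtain these from a force field typing tool.
ATOM_TYPES = [1, 1, 6, 5, 5, 5, 5, 5, 21]
BONDS = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]


def one_hot_features(atom_types):
    """Return one-hot node features and the sorted atom-type vocabulary."""
    vocab = sorted(set(atom_types))
    index = {t: i for i, t in enumerate(vocab)}
    feats = [[1.0 if index[t] == j else 0.0 for j in range(len(vocab))]
             for t in atom_types]
    return feats, vocab


def normalized_adjacency(n_atoms, bonds):
    """Adjacency with self-loops, row-normalized (mean over atom + neighbors)."""
    adj = [[1.0 if i == j else 0.0 for j in range(n_atoms)]
           for i in range(n_atoms)]
    for i, j in bonds:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, features, weights):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


def sum_pool(node_vectors):
    """Sum node vectors into one molecule-level vector."""
    return [sum(col) for col in zip(*node_vectors)]


features, vocab = one_hot_features(ATOM_TYPES)
adj_norm = normalized_adjacency(len(ATOM_TYPES), BONDS)

# Hypothetical weights: 4 atom types -> 3 output bins (a trained model learns these).
WEIGHTS = [[0.2, 0.1, 0.0],
           [0.0, 0.3, 0.1],
           [0.5, 0.0, 0.2],
           [0.1, 0.1, 0.4]]

node_out = gcn_layer(adj_norm, features, WEIGHTS)
profile = sum_pool(node_out)
print(vocab)
print(profile)
```

Even in this toy setting, the two carbons carry the same atom type but end up with different node vectors after the convolution, because one is bonded to oxygen and the other is not. Stacking several such layers lets information from more distant atoms reach each node.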