Improved Lipophilicity and Aqueous Solubility Prediction with Composite Graph Neural Networks

Molecules. 2021 Oct 13;26(20):6185. doi: 10.3390/molecules26206185.

Abstract

The accurate prediction of molecular properties, such as lipophilicity and aqueous solubility, is of great importance and poses challenges in several stages of the drug discovery pipeline. Machine learning methods, such as graph-based neural networks (GNNs), have shown exceptionally good performance in predicting these properties. In this work, we introduce a novel GNN architecture, called directed edge graph isomorphism network (D-GIN). It is composed of two distinct sub-architectures (D-MPNN, GIN) and achieves an improvement in accuracy over its sub-architectures employing various learning and featurization strategies. We argue that combining models with different key aspects helps make graph neural networks deeper and simultaneously increases their predictive power. Furthermore, we address current limitations in the assessment of deep-learning models, namely, the comparison of performance metrics from single training runs, and offer a more robust solution.
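The abstract does not spell out the more robust assessment procedure, so the following is only a minimal sketch of one common alternative to single-run comparison: repeat training several times and report the mean test RMSE with a percentile bootstrap confidence interval. The function name `summarize_runs`, its parameters, and the example values are assumptions made for illustration and are not taken from the paper.

```python
import random
import statistics


def summarize_runs(rmse_per_run, n_boot=2000, alpha=0.05, seed=0):
    """Summarize a metric collected over repeated training runs.

    Hypothetical helper (not from the paper): returns the mean, the sample
    standard deviation, and a percentile bootstrap confidence interval
    for the mean of the per-run metric.
    """
    n = len(rmse_per_run)
    if n < 2:
        raise ValueError("need at least two training runs")

    rng = random.Random(seed)  # fixed seed so the summary is reproducible
    # Resample the runs with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(rmse_per_run, k=n)) for _ in range(n_boot)
    )
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]

    return {
        "n_runs": n,
        "mean": statistics.fmean(rmse_per_run),
        "std": statistics.stdev(rmse_per_run),
        "ci": (lower, upper),
    }


# Illustrative numbers only; these are NOT results reported in the paper.
example_rmse = [0.58, 0.61, 0.57, 0.60, 0.59]
summary = summarize_runs(example_rmse)
print(summary)
```

Under this kind of scheme, two models would be compared by their intervals over many runs rather than by a single best score, which makes the comparison less sensitive to random initialization and data splits.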

Keywords: AI deep-learning; cheminformatics; computational chemistry; graph neural-networks; lipophilicity; machine-learning; molecular property; neural-networks; solubility.