Combining Group-Contribution Concept and Graph Neural Networks Toward Interpretable Molecular Property Models

J Chem Inf Model. 2023 Feb 13;63(3):725-744. doi: 10.1021/acs.jcim.2c01091. Epub 2023 Jan 30.

Abstract

Quantitative structure-property relationships (QSPRs) are important tools to facilitate and accelerate the discovery of compounds with desired properties. While many QSPRs have been developed, they are associated with various shortcomings such as a lack of generalizability and modest accuracy. Albeit various machine-learning and deep-learning techniques have been integrated into such models, another shortcoming has emerged in the form of a lack of transparency and interpretability of such models. In this work, two interpretable graph neural network (GNN) models (attentive group-contribution (AGC) and group-contribution-based graph attention (GroupGAT)) are developed by integrating fundamentals using the concept of group contributions (GC). The interpretability consists of highlighting the substructure with the highest attention weights in the latent representation of the molecules using the attention mechanism. The proposed models showcased better performance compared to classical group-contribution models, as well as against various other GNN models describing the aqueous solubility, melting point, and enthalpies of formation, combustion, and fusion of organic compounds. The insights provided are consistent with insights obtained from the semiempirical GC models confirming that the proposed framework allows highlighting the important substructures of the molecules for a specific property.

MeSH terms

  • Machine Learning*
  • Models, Molecular
  • Neural Networks, Computer*
  • Quantitative Structure-Activity Relationship