Quantitative Prediction of Vertical Ionization Potentials from DFT via a Graph-Network-Based Delta Machine Learning Model Incorporating Electronic Descriptors

J Phys Chem A. 2023 Apr 20;127(15):3472-3483. doi: 10.1021/acs.jpca.2c08821. Epub 2023 Apr 4.

Abstract

While accurate wave function theories like CCSD(T) are capable of modeling molecular chemical processes, the associated steep computational scaling renders them intractable for treating large systems or extensive databases. In contrast, density functional theory (DFT) is much more computationally feasible yet often fails to quantitatively describe electronic changes in chemical processes. Herein, we report an efficient delta machine learning (ΔML) model that builds on the Connectivity-Based Hierarchy (CBH) scheme (an error correction approach based on systematic molecular fragmentation protocols) and achieves coupled cluster accuracy on vertical ionization potentials by correcting for deficiencies in DFT. The present study integrates concepts from molecular fragmentation, systematic error cancellation, and machine learning. First, we show that by using an electron population difference map, ionization sites within a molecule may be readily identified, and CBH correction schemes for ionization processes may be automated. As a central feature of our work, we employ a graph-based QM/ML model, which embeds atom-centered features describing CBH fragments into a computational graph to further increase accuracy for the prediction of vertical ionization potentials. In addition, we show that the incorporation of electronic descriptors from DFT, namely electron population difference features, improves model performance well beyond chemical accuracy (1 kcal/mol) to approach benchmark accuracy. While the raw DFT results are strongly dependent on the underlying functional used, for our best models, the performance is robust and much less dependent on the functional.
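Two ideas from the abstract can be illustrated with a minimal Python sketch: locating an ionization site from an electron population difference map, and applying a ΔML correction to a DFT ionization potential. This is not the authors' implementation. The function names, the per-atom population lists, and the "largest population loss" rule are assumptions made for illustration, and the learned correction is passed in as a plain number rather than produced by a graph network.

```python
def population_difference(neutral_pops, cation_pops):
    """Per-atom electron population lost on ionization (neutral minus cation).

    Both arguments are hypothetical per-atom electron populations, e.g. from
    a population analysis of the neutral and cation at the same geometry.
    """
    if len(neutral_pops) != len(cation_pops):
        raise ValueError("population lists must have the same length")
    return [n - c for n, c in zip(neutral_pops, cation_pops)]


def ionization_site(neutral_pops, cation_pops):
    """Index of the atom that loses the most electron population.

    Assumption: the ionization site is taken to be the single atom with the
    largest population loss; the paper's actual criterion may differ.
    """
    diff = population_difference(neutral_pops, cation_pops)
    return max(range(len(diff)), key=lambda i: diff[i])


def delta_corrected_ip(ip_dft, predicted_delta):
    """Delta-ML estimate: the low-level (DFT) IP plus a learned correction.

    predicted_delta stands in for the model output, which is trained on the
    difference between high-level and DFT ionization potentials.
    """
    return ip_dft + predicted_delta
```

For example, with illustrative (not computed) populations `[8.3, 0.8, 0.8]` for the neutral and `[7.6, 0.7, 0.7]` for the cation, `ionization_site` selects the first atom, since it loses the most electron population. Separating the DFT value from the correction reflects the ΔML design choice: the model only has to learn the (typically smaller) residual error rather than the full ionization potential.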