Machine Learning Many-Body Green's Functions for Molecular Excitation Spectra

J Chem Theory Comput. 2024 Jan 9;20(1):143-154. doi: 10.1021/acs.jctc.3c01146. Epub 2023 Dec 27.

Abstract

We present a machine learning (ML) framework for predicting Green's functions of molecular systems, from which photoemission spectra and quasiparticle energies at the quantum many-body level can be obtained. Kernel ridge regression is adopted to predict self-energy matrix elements on compact imaginary-frequency grids from static and dynamical mean-field electronic features, giving direct access to real-frequency many-body Green's functions through analytic continuation and Dyson's equation. Feature and self-energy matrices are represented in a symmetry-adapted intrinsic atomic orbital plus projected atomic orbital basis to enforce rotational invariance. We demonstrate good transferability and high data efficiency of the proposed ML method across molecular sizes and chemical species by showing accurate predictions of the density of states (DOS) and quasiparticle energies at the level of many-body perturbation theory (GW) or full configuration interaction. For the ML model trained on 48 out of 1995 molecules randomly sampled from the QM7 and QM9 data sets, the mean absolute errors of the ML-predicted highest occupied and lowest unoccupied molecular orbital energies are 0.13 and 0.10 eV, respectively, relative to GW@PBE0. We further showcase the capability of this method by applying the same ML model to predict the DOS of significantly larger organic molecules with up to 44 heavy atoms.
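
The two numerical ingredients named in the abstract, kernel ridge regression and Dyson's equation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The Gaussian kernel, the hyperparameters gamma and lam, the broadening eta, and all function names are assumptions made for illustration. The sketch also assumes the self-energy is already available on a real-frequency grid, so the analytic continuation step from imaginary frequencies is omitted, and the toy two-level "Fock" matrix is invented data.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    """Gaussian (RBF) kernel between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit(X, Y, gamma=1.0, lam=1e-8):
    """Closed-form kernel ridge regression: alpha = (K + lam*I)^-1 Y."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict targets for X_new from the fitted coefficients alpha."""
    return gaussian_kernel(X_new, X_train, gamma) @ alpha

def dos_from_self_energy(fock, sigma, omegas, eta=0.05):
    """Density of states from Dyson's equation on a real-frequency grid.

    G(w) = [(w + i*eta) I - F - Sigma(w)]^-1, DOS(w) = -(1/pi) Im Tr G(w).
    sigma has shape (len(omegas), n, n); eta is an assumed broadening.
    """
    eye = np.eye(fock.shape[0])
    dos = np.empty(len(omegas))
    for k, w in enumerate(omegas):
        g = np.linalg.inv((w + 1j * eta) * eye - fock - sigma[k])
        dos[k] = -np.trace(g).imag / np.pi
    return dos

# Toy regression: random features and targets (illustrative data only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 3))
Y_train = rng.normal(size=(5, 2))
alpha = krr_fit(X_train, Y_train, gamma=1.0, lam=1e-10)
Y_back = krr_predict(X_train, alpha, X_train, gamma=1.0)

# Toy two-level system with zero self-energy: the DOS reduces to
# Lorentzians centred on the eigenvalues of the mean-field matrix.
fock = np.diag([-1.0, 1.0])
omegas = np.linspace(-2.0, 2.0, 401)
sigma = np.zeros((len(omegas), 2, 2), dtype=complex)
dos = dos_from_self_energy(fock, sigma, omegas)
```

In a setting like the one described above, the regression targets would be self-energy matrix elements and the inputs mean-field features. Here both are random placeholders, so only the mechanics of the fit and of the Dyson step carry over.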