Deep learning of dynamically responsive chemical Hamiltonians with semiempirical quantum mechanics

Proc Natl Acad Sci U S A. 2022 Jul 5;119(27):e2120333119. doi: 10.1073/pnas.2120333119. Epub 2022 Jul 1.

Abstract

Conventional machine-learning (ML) models in computational chemistry learn to directly predict molecular properties using quantum chemistry only for reference data. While these heuristic ML methods show quantum-level accuracy with speeds several orders of magnitude faster than traditional quantum chemistry methods, they suffer from poor extensibility and transferability; i.e., their accuracy degrades on large or new chemical systems. Incorporating quantum chemistry frameworks into the ML models directly solves this problem. Here we take the structure of semiempirical quantum mechanics (SEQM) methods to construct dynamically responsive Hamiltonians. SEQM methods use empirical parameters fitted to experimental properties to construct reduced-order Hamiltonians, facilitating much faster calculations than ab initio methods but with compromised accuracy. By replacing these static parameters with machine-learned dynamic values inferred from the local environment, we greatly improve the accuracy of the SEQM methods. Trained on molecular energies and atomic forces, these dynamically generated Hamiltonian parameters show a strong correlation with atomic hybridization and bonding. Trained with only about 60,000 small organic molecular conformers, the resulting model retains interpretability, extensibility, and transferability when testing on much larger chemical systems and predicting various molecular properties. Overall, this work demonstrates the virtues of incorporating physics-based descriptions with ML to develop models that are simultaneously accurate, transferable, and interpretable.

Keywords: Hamiltonian; machine learning; model transferability; semiempirical quantum chemistry.