Robustness of Local Predictions in Atomistic Machine Learning Models

J Chem Theory Comput. 2023 Nov 28;19(22):8020-8031. doi: 10.1021/acs.jctc.3c00704. Epub 2023 Nov 10.

Abstract

Machine learning (ML) models for molecules and materials commonly rely on a decomposition of the global target quantity into local, atom-centered contributions. This approach is convenient from a computational perspective, enabling large-scale ML-driven simulations with linear-scaling cost, and it also allows for the identification and post hoc interpretation of contributions from individual chemical environments and motifs to complicated macroscopic properties. However, even though practical justifications exist for the local decomposition, only the global quantity is rigorously defined. Thus, when the atom-centered contributions are used, their sensitivity to the training strategy or the model architecture should be carefully considered. To this end, we introduce a quantitative metric, which we call the local prediction rigidity (LPR), that allows one to assess how robust the locally decomposed predictions of ML models are. We investigate the dependence of the LPR on aspects of model training, particularly the composition of the training data set, for a range of problems from simple toy models to real chemical systems. We present strategies to systematically enhance the LPR, which can be used to improve the robustness, interpretability, and transferability of atomistic ML models.
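The central premise can be illustrated with a minimal sketch (a hypothetical linear per-atom model, not the paper's actual architecture): a global property is written as a sum of atom-centered contributions, and two models can agree exactly on the global quantity while assigning different local contributions, which is why only the global target is rigorously defined.

```python
import numpy as np

def local_contribution(features, weights):
    """Atom-centered contribution from a toy linear model on local environment features."""
    return float(features @ weights)

def global_prediction(env_features, weights):
    """Global target = sum of atom-centered contributions (linear-scaling in atom count)."""
    return sum(local_contribution(x, weights) for x in env_features)

# Toy structure: 3 atoms, each described by a 2-dimensional local-environment feature.
env_features = [np.array([1.0, 0.5]),
                np.array([0.2, 1.0]),
                np.array([0.8, 0.3])]
weights = np.array([2.0, -1.0])

per_atom = [local_contribution(x, weights) for x in env_features]
E_global = global_prediction(env_features, weights)

# Non-uniqueness of the decomposition: shifting one atom's contribution up and
# another's down by the same amount leaves the global prediction unchanged.
shifted = [per_atom[0] + 0.7, per_atom[1] - 0.7, per_atom[2]]
E_shifted = sum(shifted)
```

Because `E_global == E_shifted` while the per-atom values differ, training on global targets alone cannot pin down the local contributions; the LPR is proposed as a measure of how well-constrained each local prediction actually is.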