Uncertainty-driven dynamics for active learning of interatomic potentials

Nat Comput Sci. 2023 Mar;3(3):230-239. doi: 10.1038/s43588-023-00406-5. Epub 2023 Mar 6.

Abstract

Machine learning (ML) models, when trained on data sets of high-fidelity quantum simulations, produce accurate and efficient interatomic potentials. Active learning (AL) is a powerful tool for iteratively generating diverse data sets. In this approach, the ML model provides an uncertainty estimate along with its prediction for each new atomic configuration. If the uncertainty estimate exceeds a chosen threshold, the configuration is added to the data set. Here we develop a strategy to more rapidly discover configurations that meaningfully augment the training data set. The approach, uncertainty-driven dynamics for active learning (UDD-AL), modifies the potential energy surface used in molecular dynamics simulations to favor regions of configuration space where the model uncertainty is large. The performance of UDD-AL is demonstrated for two AL tasks: sampling the conformational space of glycine and sampling the promotion of proton transfer in acetylacetone. The method is shown to efficiently explore the chemically relevant configuration space, which may be inaccessible using regular dynamical sampling at target temperature conditions.
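The two ideas in the abstract, selecting configurations whose uncertainty exceeds a threshold and biasing the energy surface toward uncertain regions, can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the uncertainty is approximated by the variance across a small ensemble of models, the bias is a simple term linear in that variance, and the names `ensemble_predict`, `biased_energy`, `select_for_labeling`, `strength`, and `threshold` are hypothetical.

```python
import numpy as np

def ensemble_predict(models, x):
    """Return the mean energy and the ensemble variance at configuration x."""
    energies = np.array([m(x) for m in models])
    return energies.mean(), energies.var()

def biased_energy(models, x, strength=1.0):
    """Mean energy lowered in proportion to the ensemble variance.

    The linear form is an illustrative assumption, not the paper's bias.
    """
    mean, var = ensemble_predict(models, x)
    return mean - strength * var

def select_for_labeling(models, configs, threshold):
    """Keep configurations whose ensemble standard deviation exceeds threshold."""
    return [x for x in configs
            if np.sqrt(ensemble_predict(models, x)[1]) > threshold]

# Toy ensemble: three 1D "potentials" that agree at x = 0 and diverge away from it.
models = [lambda x, k=k: k * x**2 for k in (0.9, 1.0, 1.1)]
configs = [0.0, 0.5, 2.0]

# Only the configuration where the models disagree strongly is selected.
selected = select_for_labeling(models, configs, threshold=0.1)
```

In this toy setting, dynamics run on `biased_energy` rather than on the mean energy would be drawn toward large `|x|`, where the ensemble disagrees, which is the qualitative behavior the abstract describes. A real workflow would replace the lambdas with trained interatomic potentials and send the selected configurations to a quantum chemistry code for labeling.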

MeSH terms

  • Fabaceae*
  • Glycine
  • Machine Learning
  • Molecular Dynamics Simulation
  • Uncertainty

Substances

  • Glycine