Using Computationally-Determined Properties for Machine Learning Prediction of Self-Diffusion Coefficients in Pure Liquids

J Phys Chem B. 2021 Dec 2;125(47):12990-13002. doi: 10.1021/acs.jpcb.1c07092. Epub 2021 Nov 18.

Abstract

The ability to predict transport properties of liquids quickly and accurately will greatly improve our understanding of fluid properties both in bulk and in complex mixtures, as well as in confined environments. Such information could then be used in the design of materials and processes for applications ranging from energy production and storage to manufacturing. As a first step, we consider the use of machine learning (ML) methods to predict the diffusion properties of pure liquids. Recent results have shown that artificial neural networks (ANNs) can effectively predict the diffusion of pure compounds using experimental properties as the model inputs. In the current study, a similar ANN approach is applied to modeling diffusion of pure liquids using fluid properties obtained exclusively from molecular simulations. A diverse set of 102 pure liquids is considered, ranging from small polar molecules (e.g., water) to large nonpolar molecules (e.g., octane). Self-diffusion coefficients were obtained from classical molecular dynamics (MD) simulations. Since nearly all the molecules are organic compounds, a general set of force field parameters for organic molecules was used. The MD methods are validated by comparing physical and thermodynamic properties with experiment. Computational input features for the ANN include physical properties obtained from the MD simulations as well as molecular properties from quantum calculations of individual molecules. Fluid properties describing the local liquid structure were obtained from center of mass radial distribution functions (COM-RDFs). Feature sensitivity analysis revealed that isothermal compressibility, heat of vaporization, and the thermal expansion coefficient were the most impactful input properties for the ANN model that predicts the MD-simulated self-diffusion coefficients.
The MD-based ANN successfully predicts the MD self-diffusion coefficients using only a small subset (2 to 3) of the available computationally determined input features. A separate ANN model was developed using literature experimental self-diffusion coefficients as model targets. Although this second ML model was less successful because of the limited number of data points, a good correlation is still observed between experimental and ML-predicted self-diffusion coefficients.
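
The abstract does not state how the self-diffusion coefficients were extracted from the MD trajectories. One common route, assumed here purely for illustration, is the Einstein relation, in which D is one sixth of the long-time slope of the center-of-mass mean squared displacement (MSD) in three dimensions. The sketch below uses NumPy; the function names, the single time origin, and the fitting window (`fit_start`, `fit_end`) are hypothetical choices, not details taken from the study.

```python
import numpy as np


def mean_squared_displacement(positions):
    """MSD relative to the first frame, averaged over molecules.

    positions: array of shape (n_frames, n_molecules, 3) holding
    unwrapped center-of-mass coordinates (assumed input layout).
    """
    displacement = positions - positions[0]
    return (displacement ** 2).sum(axis=2).mean(axis=1)


def self_diffusion_coefficient(times, msd, fit_start=0.2, fit_end=0.8):
    """Estimate D from the Einstein relation, MSD(t) ~ 6 D t in 3D.

    A straight line is fitted to the middle portion of the MSD curve
    (the window fractions are illustrative assumptions) and the slope
    is divided by 6. Units follow the inputs, e.g. nm^2/ps.
    """
    n = len(times)
    lo, hi = int(fit_start * n), int(fit_end * n)
    slope, _intercept = np.polyfit(times[lo:hi], msd[lo:hi], 1)
    return slope / 6.0


if __name__ == "__main__":
    # Synthetic, noise-free MSD with a known D, used only as a sanity check.
    t = np.arange(100, dtype=float)
    msd = 6.0 * 2.5 * t + 0.3
    print(self_diffusion_coefficient(t, msd))
```

In practice the MSD is usually averaged over many time origins, and the fitting window is chosen to exclude the short-time ballistic regime and the noisy long-time tail. Both refinements are omitted here for brevity.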

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Diffusion
  • Machine Learning
  • Molecular Dynamics Simulation*
  • Thermodynamics
  • Water*

Substances

  • Water