Fast Prediction of Lipophilicity of Organofluorine Molecules: Deep Learning-Derived Polarity Characters and Experimental Tests

J Chem Inf Model. 2022 Oct 24;62(20):4928-4936. doi: 10.1021/acs.jcim.2c01201. Epub 2022 Oct 12.

Abstract

Fast and accurate estimation of lipophilicity for organofluorine molecules is in great demand for accelerating drug and materials discovery. A lipophilicity data set of organofluorine molecules (the OFL data set), containing 1907 samples, is constructed from density functional theory (DFT) calculations and experimental measurements. An efficient and interpretable model, called PoLogP, is developed to predict the n-octanol/water partition coefficient, log P_o/w, of organofluorine molecules from polarization descriptors. These combine polarity descriptors, namely the molecular polarity index and the molecular polarizability (α), with a hydrogen-bond (HB) index consisting of the number of donors (NHBD) and the numbers of acceptors (NHBA and NHB-FA). PoLogP, with its combination of polarity descriptors, is shown to perform better than the dipole moment (μ) alone for fluorine-containing molecules. With the aid of a multilevel attention graph convolutional neural network, the polarity descriptors of organofluorine molecules can be generated quickly, at DFT accuracy, from the topological molecular graph alone. The performance of PoLogP is further validated, with satisfactory accuracy, on synthesized organofluorine molecules and on 2626 non-fluorinated molecules, highlighting the potential use of PoLogP in high-throughput screening of functional molecules with the desired solubility in various solvent media.
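
For orientation, the target quantity and the descriptor set named in the abstract can be sketched in a few lines of Python. The first function uses the standard definition of the n-octanol/water partition coefficient, the base-10 logarithm of the ratio of equilibrium solute concentrations in the two phases. Everything else is an assumption made for illustration: the abstract does not state the functional form of PoLogP, so the class name PolarityDescriptors, its field names, the function linear_log_p, and the plain linear combination with caller-supplied weights are hypothetical stand-ins and not the authors' model.

```python
import math
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class PolarityDescriptors:
    """Descriptors listed in the abstract (field names are hypothetical)."""

    polarity_index: float  # molecular polarity index
    polarizability: float  # molecular polarizability (alpha)
    n_hbd: int             # number of hydrogen-bond donors (NHBD)
    n_hba: int             # number of hydrogen-bond acceptors (NHBA)
    n_hb_fa: int           # second acceptor count, NHB-FA in the abstract


def log_p_ow(conc_octanol: float, conc_water: float) -> float:
    """Standard definition: log10 of the octanol/water concentration ratio.

    Both concentrations must be positive and expressed in the same units.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


def linear_log_p(desc: PolarityDescriptors, weights, intercept: float = 0.0) -> float:
    """Hypothetical linear form: intercept + sum(weight_i * descriptor_i).

    This is an assumed placeholder, not the published PoLogP equation.
    The weights must be supplied by the caller, in field order.
    """
    values = astuple(desc)
    if len(weights) != len(values):
        raise ValueError("expected one weight per descriptor")
    return intercept + sum(w * v for w, v in zip(weights, values))
```

A positive log P_o/w means the solute prefers the octanol phase (more lipophilic); a negative value means it prefers water. Any weights passed to linear_log_p would have to come from fitting against a data set such as the one the abstract describes; none are given here.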

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 1-Octanol
  • Deep Learning*
  • Solubility
  • Solvents
  • Water / chemistry

Substances

  • 1-Octanol
  • Water
  • Solvents