Multiple linear regression models for predicting the n‑octanol/water partition coefficients in the SAMPL7 blind challenge

J Comput Aided Mol Des. 2021 Aug;35(8):923-931. doi: 10.1007/s10822-021-00409-2. Epub 2021 Jul 12.

Abstract

A multiple linear regression model called MLR-3 is used to predict the experimental n-octanol/water partition coefficient (log P_N) of 22 N-sulfonamides proposed by the organizers of the SAMPL7 blind challenge. The MLR-3 model was trained on 82 molecules, including drug-like sulfonamides and small organic molecules that resemble the main functional groups present in the challenge dataset. Our model, submitted as "TFE-MLR", yielded a root-mean-square error of 0.58 and a mean absolute error of 0.41 in log P units, achieving the highest accuracy both among empirical methods and among all ranked submissions. Overall, the results support the appropriateness of the multiple linear regression approach MLR-3 for computing the n-octanol/water partition coefficient of sulfonamide-bearing compounds. In this context, the outstanding performance of empirical methodologies, where 75% of the ranked submissions achieved root-mean-square errors < 1 log P units, supports the suitability of these strategies for obtaining accurate and fast predictions of physicochemical properties such as the partition coefficients of bioorganic compounds.
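
For readers unfamiliar with the workflow summarized above, the following is a minimal sketch of a generic multiple linear regression fit followed by the two error metrics quoted in the abstract (root-mean-square error and mean absolute error). It is not the MLR-3 model: the abstract does not list the descriptors or coefficients, so the two descriptor columns, the toy values, and the function names (fit_mlr, predict, rmse, mae) are all hypothetical and serve only to illustrate the arithmetic.

```python
import math

def fit_mlr(X, y):
    """Ordinary least squares for y ~ b0 + b1*x1 + ... + bk*xk (normal equations)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented system [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("descriptors are collinear; system is singular")
        M[c], M[piv] = M[piv], M[c]
        for k in range(p):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]  # [b0, b1, ..., bk]

def predict(coef, X):
    """Apply fitted coefficients to new descriptor rows."""
    return [coef[0] + sum(c * x for c, x in zip(coef[1:], r)) for r in X]

def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of y (here, log P units)."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, in the units of y (here, log P units)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy data (NOT from the paper): two made-up descriptors per
# molecule, with targets generated exactly from y = 1 + 2*x1 - 0.5*x2.
X_train = [[0, 0], [1, 0], [0, 2], [1, 2], [2, 1]]
y_train = [1.0, 3.0, 0.0, 2.0, 4.5]

coef = fit_mlr(X_train, y_train)
y_fit = predict(coef, X_train)
print("coefficients:", [round(c, 6) for c in coef])
print("RMSE:", round(rmse(y_train, y_fit), 6), "MAE:", round(mae(y_train, y_fit), 6))
```

In a blind-challenge setting the metrics would be computed on the held-out challenge molecules rather than on the training set; the toy example evaluates on the training data only to keep the sketch short.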

Keywords: Empirical methods; Multiple linear regression; N-sulfonamides; SAMPL7 blind challenge; n-Octanol/water partition coefficients.

MeSH terms

  • 1-Octanol / chemistry*
  • Computer Simulation*
  • Linear Models
  • Models, Chemical*
  • Quantum Theory*
  • Solubility
  • Thermodynamics*
  • Water / chemistry*

Substances

  • Water
  • 1-Octanol