Prediction of pKa values for aliphatic carboxylic acids and alcohols with empirical atomic charge descriptors

J Chem Inf Model. 2006 Nov-Dec;46(6):2256-66. doi: 10.1021/ci060129d.

Abstract

Two quantitative pKa prediction models, one for aliphatic carboxylic acids and one for alcohols, were developed by multiple linear regression (MLR) analysis with empirical atomic descriptors. The acid and alcohol molecules were described by sets of five and four atomic descriptors, respectively. For the pKa model of 1122 aliphatic carboxylic acids, the squared correlation coefficient is 0.813 with a standard error of prediction of 0.423; for the pKa model of 288 alcohols, the squared correlation coefficient is 0.817 with a standard error of prediction of 0.755. The good predictive ability of both models was indicated by cross-validation and by external validation. An atomic descriptor was developed to model the inductive effect of neighboring atoms on a central atom in a molecule. The ability of this descriptor to measure the inductive effect of substituent groups was demonstrated by its good correlation with Taft sigma* constants in aliphatic carboxylic acids, which provides a new approach to estimating Taft sigma* constants directly from molecular structures. An algorithm that uses Kohonen neural networks to split a data set into a training set and a test set is also presented.
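
As a rough illustration of the kind of MLR workflow the abstract describes, the following minimal Python sketch fits a linear model on five descriptors and reports a squared correlation coefficient, a residual standard error, and a leave-one-out cross-validated q^2. Everything in it is an assumption made for illustration: the descriptor matrix and pKa-like values are synthetic random numbers, not data or descriptors from the paper; the function names are hypothetical; and the abstract does not state how its standard error of prediction or its cross-validation was computed, so the residual standard error and the leave-one-out scheme used here may differ from the authors' definitions.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply a fitted model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def r_squared(y, y_hat):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def standard_error(y, y_hat, n_params):
    """Residual standard error with n - n_params degrees of freedom (assumed definition)."""
    return np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - n_params))


def loo_q2(X, y):
    """Leave-one-out cross-validated q^2: refit without each sample, then predict it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        coef_i = fit_mlr(X[mask], y[mask])
        preds[i] = predict_mlr(coef_i, X[i:i + 1])[0]
    return r_squared(y, preds)


# Synthetic stand-in data: 60 "molecules", 5 hypothetical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
true_coef = np.array([4.8, -0.9, 0.5, -0.3, 0.2, 0.1])  # made-up values
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=0.4, size=60)

coef = fit_mlr(X, y)
y_hat = predict_mlr(coef, X)
r2 = r_squared(y, y_hat)
se = standard_error(y, y_hat, n_params=len(coef))
q2 = loo_q2(X, y)

print(f"r^2 = {r2:.3f}, s = {se:.3f}, q^2 (LOO) = {q2:.3f}")
```

For a least-squares fit, the leave-one-out q^2 cannot exceed the training r^2, so a large gap between the two is one simple warning sign of overfitting. External validation, as mentioned in the abstract, would additionally require predicting a held-out test set that played no part in the fit.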

MeSH terms

  • Alcohols / chemistry*
  • Algorithms
  • Carboxylic Acids / chemistry*
  • Chemistry, Organic / methods
  • Hydrogen Bonding
  • Hydrogen-Ion Concentration*
  • Linear Models
  • Models, Chemical
  • Models, Statistical
  • Molecular Structure
  • Neural Networks, Computer

Substances

  • Alcohols
  • Carboxylic Acids