Predicting pK(a) by molecular tree structured fingerprints and PLS

J Chem Inf Comput Sci. 2003 May-Jun;43(3):870-9. doi: 10.1021/ci020386s.

Abstract

This is the second phase of the pK(a) predictor published earlier (J. Chem. Inf. Comput. Sci. 2002, 42, 796-805). The algorithm has been extended by treating specific chemical classes separately and generating tree-structured molecular descriptors tailored to each class. The training set of 625 acids and 412 bases covers the major areas of chemical space involved in protonation and deprotonation. The resulting models show excellent statistics (SE = 0.41 for acids and 0.30 for bases) and yield accurate predictions on an external test set. The quality and statistical performance of pK(a) prediction have improved considerably over the initial implementation of the method.
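The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: one PLS regression model is fitted per chemical class, and a prediction is routed to the model for the compound's class. Everything specific here is an assumption made for illustration: the class names `class_a` and `class_b`, the two-element descriptor vectors, and the response values are synthetic placeholders with no chemical meaning, and the single-response NIPALS-style PLS routine stands in for whatever PLS variant and tree-structured fingerprints the authors actually used.

```python
import math


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model by NIPALS-style deflation."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    # Work on mean-centered copies of the descriptors and the response.
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in descriptor space most covariant with y.
        w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain, stop early
            break
        w = [v / norm for v in w]
        # Scores, descriptor loadings, and the response loading.
        t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(yc[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains.
        Xc = [[Xc[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        yc = [yc[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}


def predict_pls1(model, x):
    """Predict the response for one descriptor vector."""
    xc = [v - mu for v, mu in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in model["components"]:
        t = sum(a * b for a, b in zip(xc, w))
        y_hat += q * t
        xc = [a - t * b for a, b in zip(xc, p)]
    return y_hat


# Synthetic placeholder data (NOT from the paper): descriptors and responses
# per hypothetical chemical class. Responses are exact linear functions of
# the descriptors so the fit is easy to check by hand.
TOY_TRAINING = {
    "class_a": ([[1, 0], [0, 1], [1, 1], [2, 1], [3, 5]],
                [5.0, 2.0, 4.0, 6.0, 4.0]),       # 2*x1 - x2 + 3
    "class_b": ([[0, 0], [2, 0], [0, 2], [1, 3], [4, 2]],
                [1.0, 3.0, 2.0, 3.5, 6.0]),       # x1 + 0.5*x2 + 1
}

# One independent model per class, mirroring the class-specific treatment.
models = {name: fit_pls1(X, y, n_components=2)
          for name, (X, y) in TOY_TRAINING.items()}


def predict_for_class(chem_class, descriptor):
    """Route a descriptor vector to the model trained for its class."""
    return predict_pls1(models[chem_class], descriptor)


print(predict_for_class("class_a", [2, 2]))
print(predict_for_class("class_b", [2, 2]))
```

Because the toy responses are exactly linear in two descriptors, two PLS components reproduce them up to rounding; real descriptor sets would be far wider, and the number of components would normally be chosen by cross-validation.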

MeSH terms

  • Algorithms*
  • Amines / chemistry
  • Biological Availability
  • Carboxylic Acids / chemistry*
  • Computer Simulation
  • Databases, Factual
  • Forecasting
  • Heterocyclic Compounds / chemistry*
  • Hydrogen-Ion Concentration
  • Imidazoles / chemistry
  • Kinetics
  • Least-Squares Analysis
  • Models, Chemical*
  • Protons
  • Pyridines / chemistry

Substances

  • Amines
  • Carboxylic Acids
  • Heterocyclic Compounds
  • Imidazoles
  • Protons
  • Pyridines