Tales from the war on error: the art and science of curating QSAR data

J Comput Aided Mol Des. 2015 Sep;29(9):897-910. doi: 10.1007/s10822-015-9865-0. Epub 2015 Aug 20.

Abstract

Curating the data underlying quantitative structure-activity relationship (QSAR) models is a never-ending struggle. Some curation can now be automated, but much cannot, especially where data as complex as those pertaining to molecular absorption, distribution, metabolism, excretion, and toxicity are concerned. The authors discuss some particularly challenging problem areas in terms of specific examples involving experimental context, incompleteness of data, confusion of units, problematic nomenclature, tautomerism, and misapplication of automated structure recognition tools.

Keywords: Automated structure recognition; Cytochrome P450; Data curation; Metabolism; Nomenclature; QSAR; Tautomerism.

MeSH terms

  • Chlorpromazine / chemistry
  • Chlorpromazine / pharmacokinetics
  • Cytochrome P-450 Enzyme System / metabolism
  • Data Accuracy
  • Data Curation*
  • Isomerism
  • Methylergonovine / chemistry
  • Midazolam / analogs & derivatives
  • Midazolam / chemistry
  • Molecular Structure
  • Quantitative Structure-Activity Relationship*
  • Terminology as Topic
  • Thermodynamics
  • Transition Temperature

Substances

  • 4-hydroxymidazolam
  • Cytochrome P-450 Enzyme System
  • Midazolam
  • Chlorpromazine
  • Methylergonovine