Predictive cheminformatics in drug discovery: statistical modeling for analysis of micro-array and gene expression data

Methods Mol Biol. 2012;910:165-94. doi: 10.1007/978-1-61779-965-5_9.

Abstract

The vast amounts of chemical and biological data available through robotic high-throughput assays and micro-array technologies require computational techniques for visualization, analysis, and predictive modeling. Predictive cheminformatics and bioinformatics employ statistical methods to mine these data for hidden correlations and to retrieve molecules or genes with desirable biological activity from large databases, for the purpose of drug development. While many statistical methods are commonly employed and widely accessible, their proper use requires due consideration of data representation and preprocessing, model validation and domain of applicability estimation, similarity assessment, the nature of the structure-activity landscape, and model interpretation. This chapter reviews these considerations in light of the current state of the art in statistical modeling and summarizes best practices in predictive cheminformatics.
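As an illustration of two of the considerations named above (similarity assessment and domain of applicability estimation), the following is a minimal sketch and is not taken from the chapter. It assumes molecules are represented as sets of "on" fingerprint bits, uses Tanimoto similarity, and treats a query as inside the domain when its nearest training-set neighbor reaches a similarity threshold. The function names, the toy fingerprints, and the 0.5 threshold are all hypothetical choices made for the example.

```python
from typing import FrozenSet, Iterable


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(a | b)


def in_domain(query: FrozenSet[int],
              training: Iterable[FrozenSet[int]],
              threshold: float = 0.5) -> bool:
    """True if the query's most similar training compound is at least `threshold` similar.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return max(tanimoto(query, t) for t in training) >= threshold


# Toy fingerprints (hypothetical bit indices, not real molecules).
training = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5})]
near = frozenset({1, 2, 3})   # shares most bits with the first training entry
far = frozenset({7, 8, 9})    # shares no bits with any training entry

print(in_domain(near, training))
print(in_domain(far, training))
```

In this sketch, predictions for a query such as `far` would be flagged as extrapolations rather than trusted, because no training compound resembles it. An empty training set raises a `ValueError` from `max`, which a real workflow would need to handle.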

Publication types

  • Review

MeSH terms

  • Computational Biology / methods*
  • Databases, Chemical*
  • Drug Discovery / methods*
  • Gene Expression Profiling / methods*
  • Microarray Analysis / methods*
  • Models, Statistical
  • Structure-Activity Relationship