Progress in visual representations of chemical space

Expert Opin Drug Discov. 2015;10(9):959-73. doi: 10.1517/17460441.2015.1060216. Epub 2015 Jun 22.

Abstract

Introduction: The concept of 'chemical space' takes two forms: the discrete set of all possible molecules, and the multi-dimensional descriptor space that encompasses them. Approaches based on this concept are widely used for the analysis and enumeration of compound databases, library design, and structure-activity relationship (SAR) and landscape studies. Visual representations of chemical space differ in their applicability domains and features, so choosing the right tool for a particular problem requires expert knowledge.

Areas covered: In this review, the authors present recent advances in the visualization of chemical space within the framework of the current general understanding of the topic. Attention is given to methods such as van Krevelen diagrams, descriptor plots, principal component analysis (PCA), self-organizing maps (SOM), generative topographic mapping (GTM), and graph- and network-based approaches. Notable application examples are provided.
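
As an illustration of one of the methods listed above, the following is a minimal sketch of a PCA projection of a descriptor space onto two dimensions. It is not taken from the review. The function name pca_project, the choice of three descriptor columns, and the numeric values are all hypothetical and serve only to show the idea: each molecule is a row of descriptors, and the first two principal components give plottable 2D coordinates.

```python
import numpy as np

def pca_project(descriptors, n_components=2):
    """Project a (molecules x descriptors) table onto its leading principal components.

    Hypothetical helper for illustration; columns are autoscaled first because
    descriptors usually have very different units and ranges.
    """
    X = np.asarray(descriptors, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0                      # leave constant columns unscaled
    scaled = centered / std
    # SVD of the scaled data: rows of Vt are the principal axes
    _, S, Vt = np.linalg.svd(scaled, full_matrices=False)
    coords = scaled @ Vt[:n_components].T    # 2D map coordinates per molecule
    explained = (S ** 2) / np.sum(S ** 2)    # fraction of variance per component
    return coords, explained[:n_components]

# Made-up descriptor rows (assumed columns: molecular weight, logP, H-bond donors)
toy = [
    [180.2, 1.2, 1],
    [151.2, 0.5, 2],
    [206.3, 3.5, 1],
    [46.1, -0.3, 1],
    [78.1, 2.1, 0],
]
coords, ratio = pca_project(toy)
```

The rows of coords can be passed to any scatter-plot routine to draw the map, and ratio indicates how much of the descriptor variance the two plotted axes retain.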

Expert opinion: With the growth of computational power, visual representations of large datasets are becoming increasingly common instruments in the chemoinformatician's toolbox. Every scientist in the field can find a method suited to a particular task. However, no universal reference representation of chemical space is currently available, and expert knowledge is required.

Keywords: big data; chemical space; chemoinformatics; data mining; visualization.

Publication types

  • Review

MeSH terms

  • Databases, Chemical
  • Drug Design*
  • Drug Discovery / methods
  • Humans
  • Models, Chemical*
  • Principal Component Analysis
  • Small Molecule Libraries
  • Structure-Activity Relationship

Substances

  • Small Molecule Libraries