Substructural Connectivity Fingerprint and Extreme Entropy Machines: A New Method of Compound Representation and Analysis

Molecules. 2018 May 23;23(6):1242. doi: 10.3390/molecules23061242.

Abstract

Key-based substructural fingerprints are an important element of computer-aided drug design techniques. They are highly useful for filtering compound databases, as they allow for the quick rejection of molecules with a low probability of being active. However, this representation has a flaw: it does not consider the connections between substructures. When the connections between particular chemical moieties change, the fingerprint representation of the compound remains the same, which makes it difficult to distinguish between active and inactive compounds. In this study, we present a new method of compound representation, substructural connectivity fingerprints (SCFP), which provides information not only about the presence of particular substructures in the molecule but also about the connections between them. This representation was analyzed using a recently developed methodology, extreme entropy machines (EEM). The SCFP can be a valuable addition to virtual screening tools, as it represents compound structure in greater detail and with more specificity, allowing for more accurate classification.
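The limitation described in the abstract can be illustrated with a toy sketch. This is not the authors' SCFP implementation: the key set, the moiety-graph encoding, and the functions `key_fingerprint` and `connectivity_fingerprint` are hypothetical, chosen only to show how two compounds built from the same moieties share a key-based fingerprint while a representation that also records which substructures are connected tells them apart.

```python
from itertools import combinations_with_replacement

# Hypothetical key set; real key-based fingerprints use far larger dictionaries.
KEYS = ["phenyl", "amide", "piperazine", "hydroxyl"]

# All unordered pairs of keys, in a fixed order, used as connectivity bits.
PAIRS = list(combinations_with_replacement(KEYS, 2))


def key_fingerprint(moieties):
    """Presence/absence bit for each key (connections are ignored)."""
    present = set(moieties.values())
    return [int(key in present) for key in KEYS]


def connectivity_fingerprint(moieties, bonds):
    """Key bits followed by one bit per pair of directly connected keys.

    Assumption: a molecule is given as a dict of node id -> moiety name
    and a list of (node, node) links between moieties.
    """
    connected = set()
    for a, b in bonds:
        pair = tuple(sorted((moieties[a], moieties[b]), key=KEYS.index))
        connected.add(pair)
    return key_fingerprint(moieties) + [int(p in connected) for p in PAIRS]


# Two toy compounds with identical moieties but different linkage order.
moieties = {0: "phenyl", 1: "amide", 2: "piperazine", 3: "hydroxyl"}
bonds_a = [(0, 1), (1, 2), (2, 3)]  # phenyl-amide-piperazine-hydroxyl
bonds_b = [(0, 2), (2, 1), (1, 3)]  # phenyl-piperazine-amide-hydroxyl

same_keys = key_fingerprint(moieties) == key_fingerprint(moieties)
fp_a = connectivity_fingerprint(moieties, bonds_a)
fp_b = connectivity_fingerprint(moieties, bonds_b)
print("key-based fingerprints identical:", same_keys)
print("connectivity fingerprints identical:", fp_a == fp_b)
```

In this sketch the key bits are the same for both compounds, and only the appended pair bits differ. A production workflow would derive the moieties and their links from real structures with a cheminformatics toolkit rather than from hand-written dictionaries.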

Keywords: fingerprint; machine learning; molecular representation; substructures.

MeSH terms

  • Chemistry, Pharmaceutical
  • Computer-Aided Design
  • Databases, Factual
  • Databases, Pharmaceutical
  • Drug Design
  • Drug Evaluation, Preclinical
  • Entropy
  • Machine Learning
  • Molecular Structure
  • Small Molecule Libraries / chemistry*
  • Structure-Activity Relationship

Substances

  • Small Molecule Libraries