Using Graph Databases to Investigate Trends in Structure-Activity Relationship Networks

J Chem Inf Model. 2020 Dec 28;60(12):6120-6134. doi: 10.1021/acs.jcim.0c00947. Epub 2020 Nov 27.

Abstract

Mining the steadily increasing amount of chemical and biological data is a key challenge in drug discovery. Graph databases offer viable alternatives for capturing interrelationships between molecules and for generating novel insights for design. In a graph database, molecules and their properties are mapped to nodes, while relationships are described by edges. Here, we introduce a graph database for navigation in chemical space, analogue searching, and structure-activity relationship (SAR) analysis. We illustrate this concept using hERG channel inhibitors from ChEMBL to extract SAR knowledge. This graph database is built using different relationships, namely 2D-fingerprint similarity, matched molecular pairs, topomer distances, and structure-activity landscape indices (SALI). Typical applications include retrieving analogues linked to the query compound by single- or multiple-edge paths, as well as detecting nonadditive SAR features. Finally, we identify triplets of linked molecules for clustering. The speed of searching and analysis allows the user to navigate the database interactively and to address complex questions in real time.
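
The node-and-edge model described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation, and the abstract does not name the database engine they used. The code below is a minimal, standard-library-only approximation under several assumptions: the compound names, feature sets (stand-ins for 2D fingerprints), and activity values are made up for illustration; edges come from Tanimoto similarity only, so matched molecular pairs and topomer distances are not modelled; and SALI is computed with the commonly used definition |A_i - A_j| / (1 - sim_ij). All function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Toy data, invented for illustration only: integer feature sets stand in
# for 2D fingerprints, and the activity values stand in for pIC50.
fingerprints = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {1, 2, 4, 5},
    "D": {1, 2, 5, 6},
    "E": {2, 5, 6, 7},
}
activity = {"A": 5.0, "B": 7.0, "C": 5.5, "D": 6.0, "E": 6.1}


def tanimoto(a, b):
    """Tanimoto similarity of two feature sets."""
    return len(a & b) / len(a | b)


def build_graph(fps, threshold):
    """Nodes are molecules; an undirected edge links each pair whose
    similarity reaches the threshold. The edge stores the similarity."""
    graph = {name: {} for name in fps}
    for u, v in combinations(sorted(fps), 2):
        sim = tanimoto(fps[u], fps[v])
        if sim >= threshold:
            graph[u][v] = sim
            graph[v][u] = sim
    return graph


def sali(graph, act, u, v):
    """Structure-activity landscape index for one edge (assumed definition:
    activity difference divided by structural distance)."""
    sim = graph[u][v]
    if sim >= 1.0:
        return float("inf")  # identical fingerprints, different activity
    return abs(act[u] - act[v]) / (1.0 - sim)


def rank_cliffs(graph, act):
    """All edges sorted by descending SALI (largest activity cliffs first)."""
    edges = [(u, v) for u in sorted(graph) for v in sorted(graph[u]) if u < v]
    return sorted(edges, key=lambda e: sali(graph, act, *e), reverse=True)


def analogues_within(graph, query, max_hops):
    """Breadth-first search: molecules reachable from the query by paths of
    at most max_hops edges, mapped to their hop distance."""
    hops = {query: 0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[query]
    return hops


def triplets(graph):
    """Sets of three molecules that are all pairwise linked."""
    return [
        t for t in combinations(sorted(graph), 3)
        if all(b in graph[a] for a, b in combinations(t, 2))
    ]


graph = build_graph(fingerprints, threshold=0.5)
print(analogues_within(graph, "A", max_hops=2))
print(rank_cliffs(graph, activity)[:3])
print(triplets(graph))
```

A dedicated graph database would answer the same questions (neighbours within k edges, highest-SALI edges, fully linked triplets) through its query language and indexes rather than in-process loops; the sketch only shows what those queries compute.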

MeSH terms

  • Cluster Analysis
  • Databases, Factual
  • Drug Discovery*
  • Structure-Activity Relationship