"Molecular Anatomy": a new multi-dimensional hierarchical scaffold analysis tool

J Cheminform. 2021 Jul 23:13:54. doi: 10.1186/s13321-021-00526-y. eCollection 2021.

Abstract

The scaffold representation is widely employed to classify bioactive compounds on the basis of common core structures or correlate compound classes with specific biological activities. In this paper, we present a novel approach called "Molecular Anatomy" as a flexible and unbiased molecular scaffold-based metrics to cluster large set of compounds. We introduce a set of nine molecular representations at different abstraction levels, combined with fragmentation rules, to define a multi-dimensional network of hierarchically interconnected molecular frameworks. We demonstrate that the introduction of a flexible scaffold definition and multiple pruning rules is an effective method to identify relevant chemical moieties. This approach allows to cluster together active molecules belonging to different molecular classes, capturing most of the structure activity information, in particular when libraries containing a huge number of singletons are analyzed. We also propose a procedure to derive a network visualization that allows a full graphical representation of compounds dataset, permitting an efficient navigation in the scaffold's space and significantly contributing to perform high quality SAR analysis. The protocol is freely available as a web interface at https://ma.exscalate.eu .

Keywords: Clustering; HTS data analysis; Library design; Network visualization; Scaffold analysis.