Structure/response correlations and similarity/diversity analysis by GETAWAY descriptors. 1. Theory of the novel 3D molecular descriptors

J Chem Inf Comput Sci. 2002 May-Jun;42(3):682-92. doi: 10.1021/ci015504a.

Abstract

Novel molecular descriptors are presented that are based on a leverage matrix similar to the one defined in statistics and commonly used for regression diagnostics. This leverage matrix, called the Molecular Influence Matrix (MIM), is proposed here as a new molecular representation that is easily calculated from the spatial coordinates of the atoms of the molecule in a chosen conformation. The proposed molecular descriptors are called GETAWAY (GEometry, Topology, and Atom-Weights AssemblY) because they try to match the 3D molecular geometry provided by the molecular influence matrix and the atom relatedness given by molecular topology with chemical information, using different atomic weightings (atomic mass, polarizability, van der Waals volume, and electronegativity, together with unit weights). A first set of molecular descriptors, called H-GETAWAY, is derived using only the information provided by the molecular influence matrix, while a second set, called R-GETAWAY, combines this information with geometric interatomic distances in the molecule. The predictive ability of the new descriptors in structure-property correlations was tested by analyzing regressions of these descriptors for selected properties of octanes.
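
As an illustration of the leverage-matrix idea described in the abstract, the following minimal Python sketch computes a statistical hat matrix H = M (M^T M)^-1 M^T from centered atomic coordinates M. The abstract does not state the formulas, so everything here is an assumption: the function names (center, inverse_3x3, molecular_influence_matrix, influence_distance_matrix) are hypothetical, the coordinates are made-up, roughly methane-like values in angstroms, and the influence/distance form R_ij = sqrt(h_ii h_jj) / r_ij is one plausible way to combine leverages with interatomic distances, not a confirmed definition from the paper. The sketch also assumes a non-planar conformation, so that M^T M is invertible.

```python
import math

def center(coords):
    """Shift coordinates so that their centroid sits at the origin."""
    n = len(coords)
    centroid = [sum(atom[k] for atom in coords) / n for k in range(3)]
    return [[atom[k] - centroid[k] for k in range(3)] for atom in coords]

def inverse_3x3(a):
    """Invert a 3x3 matrix via its signed cofactors (cyclic indexing)."""
    cof = [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
            - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    det = sum(a[0][j] * cof[0][j] for j in range(3))
    if abs(det) < 1e-12:
        # Planar or linear geometries make M^T M singular; this sketch
        # does not handle them (a pseudoinverse would be needed).
        raise ValueError("M^T M is singular for this geometry")
    # The inverse is the transposed cofactor matrix divided by det.
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]

def molecular_influence_matrix(coords):
    """Hat (leverage) matrix H = M (M^T M)^-1 M^T of centered coordinates."""
    m = center(coords)
    n = len(m)
    gram = [[sum(m[a][k] * m[a][l] for a in range(n)) for l in range(3)]
            for k in range(3)]
    g_inv = inverse_3x3(gram)
    # mg = M (M^T M)^-1, an n x 3 intermediate.
    mg = [[sum(m[i][k] * g_inv[k][l] for k in range(3)) for l in range(3)]
          for i in range(n)]
    return [[sum(mg[i][l] * m[j][l] for l in range(3)) for j in range(n)]
            for i in range(n)]

def influence_distance_matrix(coords, h):
    """Assumed form: R_ij = sqrt(h_ii * h_jj) / r_ij, with a zero diagonal."""
    n = len(coords)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Clamp tiny negative round-off before taking the root.
                lev = max(0.0, h[i][i] * h[j][j])
                r[i][j] = math.sqrt(lev) / math.dist(coords[i], coords[j])
    return r

# Illustrative, made-up coordinates (angstroms): one central atom
# surrounded by four atoms at tetrahedral positions.
coords = [
    (0.0, 0.0, 0.0),
    (0.63, 0.63, 0.63),
    (-0.63, -0.63, 0.63),
    (-0.63, 0.63, -0.63),
    (0.63, -0.63, -0.63),
]
H = molecular_influence_matrix(coords)
R = influence_distance_matrix(coords, H)
leverages = [H[i][i] for i in range(len(coords))]
```

The diagonal elements of H are the atomic leverages: atoms far from the molecular centroid receive larger values, and an atom sitting exactly at the centroid receives zero. Because H is a projection matrix, its trace equals the rank of the centered coordinate matrix, which is 3 for any non-planar conformation.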