Comparisons of Molecular Structure Generation Methods Based on Fragment Assemblies and Genetic Graphs

J Chem Inf Model. 2021 Sep 27;61(9):4245-4258. doi: 10.1021/acs.jcim.1c00803. Epub 2021 Aug 18.

Abstract

Quantitative structure-property relationships (QSPRs) have been used to predict molecular properties for several decades, whereas the automatic design of new molecular structures is still emerging. The choice of an algorithm to generate molecules is not obvious and depends on several factors, such as the desired chemical diversity (relative to the content of an initial dataset) and the level of construction (atoms, fragments, or pattern-based methods). In this paper, we address the problem of molecular structure generation by revisiting two approaches: fragment-based methods (FMs) and genetic-based methods (GMs). We define a set of indices to compare generation methods on a specific task. These new indices describe the extent of the explored data space (coverage), compare how the data space is explored (representativeness), and quantify the ratio of molecules satisfying the requirements (generation specificity), without using a database of real chemicals as a reference. The indices were used to compare generations of molecules that fulfill a desired property criterion, as evaluated by QSPR.
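As an illustration only, the generation specificity index described above (the ratio of generated molecules that satisfy the requirements) can be sketched as follows. The abstract does not give the exact formula, so the function name `generation_specificity`, the property window `lower`/`upper`, the removal of duplicate structures, and the stand-in predictor `toy_qspr` are all assumptions made for this example and are not the authors' implementation.

```python
from typing import Callable, Iterable


def generation_specificity(
    molecules: Iterable[str],
    predict_property: Callable[[str], float],
    lower: float,
    upper: float,
) -> float:
    """Fraction of distinct generated molecules whose predicted property
    falls inside [lower, upper].

    Assumption: duplicates are counted once; the paper may treat them
    differently.
    """
    unique = set(molecules)
    if not unique:
        return 0.0
    hits = sum(1 for m in unique if lower <= predict_property(m) <= upper)
    return hits / len(unique)


def toy_qspr(smiles: str) -> float:
    """Hypothetical stand-in for a QSPR model: counts 'C' characters in a
    SMILES string. A real QSPR model would be a trained predictor."""
    return float(smiles.count("C"))


# Hypothetical output of a generator (SMILES strings, one duplicate).
generated = ["CCO", "CCCC", "C", "CCO", "CCCCCC"]

# Requirement for this toy example: predicted property between 2 and 4.
specificity = generation_specificity(generated, toy_qspr, 2, 4)
print(f"generation specificity: {specificity:.2f}")
```

In this sketch the index needs only the generated set and a property predictor, which is consistent with the statement that no reference database of real chemicals is required. Coverage and representativeness are not sketched here because the abstract does not describe how they are computed.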

MeSH terms

  • Algorithms*
  • Molecular Structure
  • Quantitative Structure-Activity Relationship*