Resource Cut, a New Bounding Procedure to Algorithms for Enumerating Tree-Like Chemical Graphs

IEEE/ACM Trans Comput Biol Bioinform. 2019 Jan-Feb;16(1):77-90. doi: 10.1109/TCBB.2018.2832061. Epub 2018 May 1.

Abstract

Enumerating chemical compounds with given structural properties plays an important role in structure elucidation, with applications such as drug design. We focus on the problem of enumerating tree-like chemical graphs specified by upper and lower bounds on feature vectors, where chemical graphs represent compounds, and a feature vector characterizes frequencies of finite paths in a graph. Building on the branch-and-bound algorithm proposed in earlier work, we propose a new bounding procedure, called Resource Cut, to speed up the enumeration process. Tree-like chemical graphs are modeled as vertex-colored trees, colors representing chemical elements. The algorithm is based on a scheme of generating each unique colored tree with a specified number n of vertices. A colored tree is constructed by repeatedly appending vertices. Given a set R of n colored vertices, we found that the algorithm often constructs trees that cannot be extended to a unique representation of a colored tree no matter how the remaining unused colored vertices in the set R are appended. We derive a mathematical condition to detect and discard such trees. Experimental results show that Resource Cut significantly reduces the search space. We have been able to obtain exact numbers of chemical graphs with up to 17 vertices excluding hydrogen atoms.

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Databases, Chemical*
  • Molecular Structure
  • Software