Enumerating treelike chemical graphs with given path frequency

J Chem Inf Model. 2008 Jul;48(7):1345-57. doi: 10.1021/ci700385a. Epub 2008 Jun 28.

Abstract

The enumeration of chemical graphs satisfying given constraints is one of the fundamental problems in chemoinformatics. In this paper, we consider the problem of enumerating (i.e., listing) all treelike chemical graphs from a given path frequency. We propose an exact algorithm for enumerating all solutions to this problem on the basis of the branch-and-bound method. To further improve the efficiency of the enumeration, we introduce a new variant of the compound enumeration problem by adding a specification on the number of multiple bonds to the input and design another exact enumeration algorithm. The experimental results show that our algorithms can efficiently solve instances with larger sizes that are impossible to solve by the previous methods. In particular, we apply the latter algorithm to the enumeration problem of the special treelike chemical structures-alkane isomers. The theoretical and experimental results show that our algorithm works at least as fast as the state-of-the-art algorithms specially designed for generating alkane isomers, however using much less memory space.