A Novel Method for Inferring Chemical Compounds With Prescribed Topological Substructures Based on Integer Programming

IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3233-3245. doi: 10.1109/TCBB.2021.3112598. Epub 2022 Dec 8.

Abstract

Drug discovery is one of the major goals of computational biology and bioinformatics. A novel framework has recently been proposed for the design of chemical graphs using both artificial neural networks (ANNs) and mixed integer linear programming (MILP). This method consists of a prediction phase and an inverse prediction phase. In the first phase, an ANN is trained using data on existing chemical compounds. In the second phase, given a target chemical property, a feature vector is inferred by solving an MILP formulated from the trained ANN and then a set of chemical structures is enumerated by a graph enumeration algorithm. Although exact solutions are guaranteed by this framework, the types of chemical graphs have been restricted to such classes as trees, monocyclic graphs, and graphs with a specified polymer topology with cycle index up to 2. To overcome the limitation on the topological structure, we propose a new flexible modeling method to the framework so that we can specify a topological substructure of graphs and a partial assignment of chemical elements and bond-multiplicity to a target graph. The results of computational experiments suggest that the proposed system can infer chemical graphs with around up to 50 non-hydrogen atoms.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Drug Discovery
  • Neural Networks, Computer*
  • Programming, Linear