Comprehensive assessment of deep generative architectures for de novo drug design

Brief Bioinform. 2022 Jan 17;23(1):bbab544. doi: 10.1093/bib/bbab544.

Abstract

Recently, deep learning (DL)-based de novo drug design represents a new trend in pharmaceutical research, and numerous DL-based methods have been developed for the generation of novel compounds with desired properties. However, a comprehensive understanding of the advantages and disadvantages of these methods is still lacking. In this study, the performances of different generative models were evaluated by analyzing the properties of the generated molecules in different scenarios, such as goal-directed (rediscovery, optimization and scaffold hopping of active compounds) and target-specific (generation of novel compounds for a given target) tasks. In overall, the DL-based models have significant advantages over the baseline models built by the traditional methods in learning the physicochemical property distributions of the training sets and may be more suitable for target-specific tasks. However, both the baselines and DL-based generative models cannot fully exploit the scaffolds of the training sets, and the molecules generated by the DL-based methods even have lower scaffold diversity than those generated by the traditional models. Moreover, our assessment illustrates that the DL-based methods do not exhibit obvious advantages over the genetic algorithm-based baselines in goal-directed tasks. We believe that our study provides valuable guidance for the effective use of generative models in de novo drug design.

Keywords: de novo drug design; deep learning; genetic algorithm; machine learning; molecular generation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Deep Learning
  • Drug Design*
  • Drug Discovery / methods*