Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP

Nat Commun. 2022 Jun 10;13(1):3342. doi: 10.1038/s41467-022-30970-9.

Abstract

The complete biosynthetic pathways are unknown for most natural products (NPs), it is thus valuable to make computer-aided bio-retrosynthesis predictions. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained using both general organic and biosynthetic reactions through end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways can be efficiently sampled through an AND-OR tree-based planning algorithm from iterative multi-step bio-retrosynthetic routes. Extensive evaluations reveal that BioNavi-NP can identify biosynthetic pathways for 90.2% of 368 test compounds and recover the reported building blocks as in the test set for 72.8%, 1.7 times more accurate than existing conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit as well as the curated datasets and learned models are freely available to facilitate the elucidation and reconstruction of the biosynthetic pathways for NPs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biological Products*
  • Biosynthetic Pathways
  • Deep Learning*
  • Neural Networks, Computer

Substances

  • Biological Products