Genomic charting of ribosomally synthesized natural product chemical space facilitates targeted mining

Proc Natl Acad Sci U S A. 2016 Oct 18;113(42):E6343-E6351. doi: 10.1073/pnas.1609014113. Epub 2016 Oct 3.

Abstract

Microbial natural products are an evolved resource of bioactive small molecules that form the foundation of many modern therapeutic regimens. Ribosomally synthesized and posttranslationally modified peptides (RiPPs) are a class of natural products that has attracted extensive interest for its diverse chemical structures and potent biological activities. Genome sequencing has revealed that the vast majority of genetically encoded natural products remain unknown. Many bioinformatic resources have therefore been developed to predict the chemical structures of natural products, particularly nonribosomal peptides and polyketides, from sequence data. However, the diversity and complexity of RiPPs have hindered their systematic investigation, and consequently the vast majority of genetically encoded RiPPs remain chemical "dark matter." Here, we introduce an algorithm to catalog RiPP biosynthetic gene clusters and chart genetically encoded RiPP chemical space. A global analysis of 65,421 prokaryotic genomes revealed 30,261 RiPP clusters, encoding 2,231 unique products. We further leverage the structure predictions generated by our algorithm to facilitate the genome-guided discovery of a molecule from a rare family of RiPPs. Our results provide a systematic investigation of RiPP genetic and chemical space, reveal the widespread distribution of RiPP biosynthesis throughout the prokaryotic tree of life, and establish a platform for the targeted discovery of RiPPs based on genome sequencing.
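
The relationship between the cluster count and the unique product count reported above can be illustrated with a minimal sketch. It assumes, hypothetically, that each cluster has already been reduced to a single predicted product identifier (for example, a canonical structure string). The function name, the record layout, and the toy data are illustrative assumptions, not the authors' algorithm or data.

```python
from collections import defaultdict


def group_clusters_by_product(predictions):
    """Map each distinct predicted product to the cluster IDs that encode it.

    `predictions` is an iterable of (cluster_id, predicted_product) pairs.
    Both fields are hypothetical stand-ins for whatever a real pipeline emits.
    """
    groups = defaultdict(list)
    for cluster_id, product in predictions:
        groups[product].append(cluster_id)
    return dict(groups)


# Toy input (invented for illustration): four clusters, two of which
# encode the same predicted product.
toy_predictions = [
    ("cluster_001", "product_A"),
    ("cluster_002", "product_B"),
    ("cluster_003", "product_A"),
    ("cluster_004", "product_C"),
]

groups = group_clusters_by_product(toy_predictions)
n_clusters = sum(len(ids) for ids in groups.values())  # total clusters seen
n_unique = len(groups)                                 # distinct products
print(n_clusters, n_unique)
```

By the same arithmetic, 30,261 clusters collapsing to 2,231 unique products corresponds to an average of roughly 13.6 clusters per predicted product.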

Keywords: chemical space; cheminformatics; genome mining; natural product discovery; ribosomally synthesized natural product.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biological Products*
  • Cluster Analysis
  • Computational Biology / methods*
  • Genomics* / methods
  • Markov Chains
  • Peptides / genetics
  • Peptides / metabolism
  • Prokaryotic Cells / physiology
  • Protein Biosynthesis / genetics*
  • Protein Processing, Post-Translational
  • Reproducibility of Results
  • Ribosomes / metabolism*

Substances

  • Biological Products
  • Peptides