Metagenomic data reveals type I polyketide synthase distributions across biomes

mSystems. 2023 Jun 29;8(3):e0001223. doi: 10.1128/msystems.00012-23. Epub 2023 Jun 5.

Abstract

Microbial polyketide synthase (PKS) genes encode the biosynthesis of many biomedically or otherwise commercially important natural products. Despite extensive discovery efforts, metagenomic analyses suggest that only a small fraction of nature's polyketide biosynthetic potential has been realized. Much of this potential originates from type I PKSs (T1PKSs), which can be further delineated based on their domain organization and the structural features of the compounds they produce. Notably, phylogenetic relationships among ketosynthase (KS) domains provide an effective method to classify the larger and more complex T1PKS genes in which they occur. Increased access to large metagenomic data sets from diverse habitats provides opportunities to assess T1PKS biosynthetic diversity and distributions through their smaller and more tractable KS domain sequences. Here, we used the web tool NaPDoS2 to detect and classify over 35,000 type I KS domains from 137 metagenomic data sets reported from eight diverse, globally distributed biomes. We found biome-specific separation: relative to other biomes, soils were enriched in KSs from modular cis-acyltransferase (cis-AT) and hybrid cis-AT PKSs, and marine sediments were enriched in KSs associated with polyunsaturated fatty acid and enediyne biosynthesis. We linked the phylum Actinobacteria to soil-derived enediyne and cis-AT KSs, whereas marine-derived KSs associated with enediyne and monomodular PKSs were linked to phyla from which the compounds produced by these biosynthetic enzymes have not been reported. These KSs were phylogenetically distinct from those associated with experimentally characterized PKSs, suggesting they may be associated with novel structures or enzyme functions. Finally, we employed our metagenome-extracted KS domains to evaluate the PCR primers commonly used to amplify type I KSs and identified modifications that could increase the KS sequence diversity recovered from amplicon libraries.
IMPORTANCE Polyketides are a crucial source of medicines, agrichemicals, and other commercial products. Advances in our understanding of polyketide biosynthesis, coupled with the increased availability of metagenomic sequence data, provide new opportunities to assess polyketide biosynthetic potential across biomes. Here, we used the web tool NaPDoS2 to assess type I polyketide synthase (PKS) diversity and distributions by detecting and classifying ketosynthase (KS) domains across 137 metagenomes. We show that biomes are differentially enriched in type I KS domains, providing a roadmap for future biodiscovery strategies. Furthermore, KS phylogenies reveal biome-specific clades that do not include biochemically characterized PKSs, highlighting the biosynthetic potential of poorly explored environments. The large metagenome-derived KS data set allowed us to identify regions of commonly used type I KS PCR primers that could be modified to capture a larger extent of environmental KS diversity. These results facilitate both the search for novel polyketides and our understanding of the biogeographical distribution of PKSs across Earth's major biomes.
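The primer evaluation mentioned above can be illustrated with a minimal Python sketch of one possible approach: scoring how well a degenerate primer matches a set of sequences by counting IUPAC mismatches. This is not the authors' pipeline. The function names, the toy primer, and the toy sequences are all hypothetical and chosen only for illustration, and the sketch searches the forward strand only.

```python
# IUPAC nucleotide codes mapped to the bases each code can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(
        base not in IUPAC[code]
        for code, base in zip(primer.upper(), site.upper())
    )


def best_match(primer, sequence):
    """Slide the primer along the sequence (forward strand only).

    Returns (fewest mismatches, offset of that window); ties go to the
    leftmost window.
    """
    width = len(primer)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the primer")
    return min(
        (count_mismatches(primer, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


def fraction_matched(primer, sequences, max_mismatches=0):
    """Fraction of sequences with a binding site within the mismatch limit."""
    hits = sum(best_match(primer, s)[0] <= max_mismatches for s in sequences)
    return hits / len(sequences)


# Hypothetical toy data, NOT real KS primers or KS sequences.
toy_primer = "GCNATGGAYCC"
toy_sequences = [
    "TTGCGATGGATCCAA",  # perfect match (N=G, Y=T)
    "TTGCCATGGACCCAA",  # perfect match (N=C, Y=C)
    "TTGCGATGGAACCAA",  # one mismatch: A at the Y position
]

if __name__ == "__main__":
    for limit in (0, 1):
        share = fraction_matched(toy_primer, toy_sequences, limit)
        print(f"<= {limit} mismatches: {share:.2f} of sequences matched")
```

In this toy case, the third sequence fails only at the Y position, so widening that position's degeneracy would recover it. That is the kind of primer modification the abstract refers to, although the actual positions and changes identified in the study are not given here.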

Keywords: NaPDoS2; biomes; biosynthetic diversity; metagenomes; natural products; polyketide synthase; specialized metabolites.

MeSH terms

  • Enediynes
  • Metagenome / genetics
  • Phylogeny
  • Polyketide Synthases* / genetics
  • Polyketides*

Substances

  • Polyketide Synthases
  • Polyketides
  • Enediynes