Metagenomic Data Reveal Type I Polyketide Synthase Distributions Across Biomes

bioRxiv [Preprint]. 2023 Jan 11:2023.01.09.523365. doi: 10.1101/2023.01.09.523365.

Abstract

Microbial polyketide synthase (PKS) genes encode the biosynthesis of many biomedically important natural products, yet only a small fraction of nature's polyketide biosynthetic potential has been realized. Much of this potential originates from type I PKSs (T1PKSs), which can be delineated into different classes and subclasses based on domain organization and structural features of the compounds encoded. Notably, phylogenetic relationships among PKS ketosynthase (KS) domains provide a method to classify the larger and more complex genes in which they occur. Increased access to large metagenomic datasets from diverse habitats provides opportunities to assess T1PKS biosynthetic diversity and distributions through the analysis of KS domain sequences. Here, we used the webtool NaPDoS2 to detect and classify over 35,000 type I KS domains from 137 metagenomic datasets reported from eight diverse biomes. We found biome-specific separation, with soils enriched in modular cis-AT and hybrid cis-AT KSs relative to other biomes and marine sediments enriched in KSs associated with PUFA and enediyne biosynthesis. By extracting full-length KS domains, we linked the phylum Actinobacteria to soil-specific enediyne and cis-AT clades and identified enediyne and monomodular KSs in phyla from which the associated compound classes have not been reported. These sequences were phylogenetically distinct from those associated with experimentally characterized PKSs, suggesting that novel structures or enzyme functions remain to be discovered. Lastly, we employed our metagenome-extracted KS domains to evaluate commonly used type I KS PCR primers and identified modifications that could increase the KS sequence diversity recovered from amplicon libraries.

Importance: Polyketides are a crucial source of medicines, agrichemicals, and other commercial products. Advances in our understanding of polyketide biosynthesis coupled with the accumulation of metagenomic sequence data provide new opportunities to assess polyketide biosynthetic potential across biomes. Here, we used the webtool NaPDoS2 to assess type I PKS diversity and distributions by detecting and classifying KS domains across 137 metagenomes. We show that biomes are differentially enriched in KS domain classes, providing a roadmap for future biodiscovery strategies. Further, KS phylogenies reveal biome-specific clades that do not include biochemically characterized PKSs, highlighting the biosynthetic potential of poorly explored environments. The large metagenome-derived KS dataset allowed us to identify regions of commonly used type I KS PCR primers that could be modified to capture a larger extent of KS diversity. These results facilitate both the search for novel polyketides and our understanding of the biogeographical distribution of PKSs across Earth's major biomes.
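
The primer evaluation mentioned above can be pictured with a small sketch. The abstract does not describe the authors' procedure, so the following is only one plausible approach under stated assumptions: a degenerate primer is expanded with the standard IUPAC nucleotide codes, slid along each sequence, and scored by the fraction of sequences that contain a site within a chosen mismatch budget. The function names, the toy primer, and the toy sequences are hypothetical and are not taken from the study or from NaPDoS2.

```python
# Illustrative sketch only: the primer and sequences used with these helpers
# are invented examples, not data or methods from the preprint.

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the site base is not permitted by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    return sum(
        base not in IUPAC[code]
        for code, base in zip(primer.upper(), site.upper())
    )


def best_hit(primer, sequence):
    """Slide the primer along the sequence.

    Returns (fewest mismatches, offset of that window), or None if the
    sequence is shorter than the primer.
    """
    n = len(primer)
    if len(sequence) < n:
        return None
    return min(
        (count_mismatches(primer, sequence[i:i + n]), i)
        for i in range(len(sequence) - n + 1)
    )


def fraction_matched(primer, sequences, max_mismatches=0):
    """Fraction of sequences with a primer site within the mismatch budget."""
    hits = [best_hit(primer, s) for s in sequences]
    matched = sum(1 for h in hits if h is not None and h[0] <= max_mismatches)
    return matched / len(sequences)
```

Comparing `fraction_matched` for an existing primer against a variant with one more degenerate position gives a rough sense of whether the modification would recover more sequences. This sketch scans only the given strand and weights all positions equally; a realistic evaluation would also handle the reverse complement for reverse primers and treat mismatches near the 3' end more strictly.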

Publication types

  • Preprint