A thorough analysis and categorization of bacterial interrupted adenylation domains, including previously unidentified families

RSC Chem Biol. 2020 Aug 18;1(4):233-250. doi: 10.1039/d0cb00092b. eCollection 2020 Oct 1.

Abstract

Interrupted adenylation (A) domains are key to the immense structural diversity seen in the nonribosomal peptide (NRP) class of natural products (NPs). Interrupted A domains are A domains that contain within them the catalytic portion of another domain, most commonly a methylation (M) domain. It has been well documented that methylation events occur with extreme specificity on either the backbone (N-) or side chain (O- or S-) of the amino acid (or amino acid-like) building blocks of NRPs. Here, through taxonomic and phylogenetic analyses as well as multiple sequence alignments, we evaluated the similarities and differences between interrupted A domains. We probed their taxonomic distribution amongst bacterial organisms and their evolutionary relatedness, and described the conserved motifs of each type of M domain found embedded in interrupted A domains. Additionally, we categorized interrupted A domains and the M domains within them into a total of seven distinct families and six different types, respectively. The families of interrupted A domains include two new families, 6 and 7, that possess new architectures. Rather than being interrupted at one of the previously described sites, between a2-a3 or a8-a9 of the ten conserved A domain sequence motifs (a1-a10), family 6 contains an M domain between a6-a7, a previously unknown interruption site. Family 7 demonstrates that di-interrupted A domains exist in nature: its members contain one M domain between a2-a3 and another between a6-a7, a novel arrangement. These in-depth investigations of amino acid sequences deposited in the NCBI database highlighted the prevalence of interrupted A domains in bacterial organisms, with each family of interrupted A domains having a different taxonomic distribution. They also emphasized the importance of utilizing a broad range of bacteria for NP discovery. Categorization of the families of interrupted A domains and types of M domains allowed for a better understanding of the trends among naturally occurring interrupted A domains, revealing patterns and offering insights into how to harness them for future engineering studies.
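
The interruption sites named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the list-based layout, the function names, and the helper `with_m_after` are hypothetical, and the labels cover only what the abstract states (a2-a3 and a8-a9 as previously described sites, a6-a7 for family 6, and a2-a3 plus a6-a7 for family 7). The abstract does not say how families 1 to 5 are distinguished, so the sketch does not attempt to assign them.

```python
# The ten conserved A domain sequence motifs, a1 to a10, in order.
MOTIFS = [f"a{i}" for i in range(1, 11)]

# Interruption sites described before this study, per the abstract.
PREVIOUSLY_DESCRIBED_SITES = {("a2", "a3"), ("a8", "a9")}


def with_m_after(*motifs):
    """Hypothetical helper: build a layout with an M domain after each given motif."""
    layout = []
    for motif in MOTIFS:
        layout.append(motif)
        if motif in motifs:
            layout.append("M")
    return layout


def interruption_sites(layout):
    """Return the (motif_before, motif_after) pair flanking each embedded M domain."""
    sites = []
    for i, element in enumerate(layout):
        if element != "M":
            continue
        if i == 0 or i == len(layout) - 1:
            raise ValueError("an embedded M domain must sit between two motifs")
        before, after = layout[i - 1], layout[i + 1]
        if before not in MOTIFS or after not in MOTIFS:
            raise ValueError("an embedded M domain must sit between two motifs")
        sites.append((before, after))
    return sites


def describe(layout):
    """Label a layout using only the site information given in the abstract."""
    sites = interruption_sites(layout)
    if not sites:
        return "uninterrupted"
    if sites == [("a6", "a7")]:
        return "family 6"
    if sites == [("a2", "a3"), ("a6", "a7")]:
        return "family 7"
    if len(sites) == 1 and sites[0] in PREVIOUSLY_DESCRIBED_SITES:
        # Site alone is not assumed to determine which earlier family applies.
        return "previously described site"
    return "unclassified"
```

For example, `describe(with_m_after("a2", "a6"))` yields the di-interrupted family 7 label, whereas `describe(with_m_after("a8"))` is reported only as a previously described site.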