Bioinformatics-Guided Expansion and Discovery of Graspetides

ACS Chem Biol. 2021 Dec 17;16(12):2787-2797. doi: 10.1021/acschembio.1c00672. Epub 2021 Nov 12.

Abstract

Graspetides are a class of ribosomally synthesized and post-translationally modified peptide natural products featuring ATP-grasp ligase-dependent formation of macrolactones/macrolactams. These modifications arise from serine, threonine, or lysine donor residues linked to aspartate or glutamate acceptor residues. Characterized graspetides include serine protease inhibitors such as the microviridins and plesiocin. Here, we report an update to Rapid ORF Description and Evaluation Online (RODEO) for the automated detection of graspetides, which identified 3,923 high-confidence graspetide biosynthetic gene clusters. Sequence and co-occurrence analyses doubled the number of graspetide groups, which are defined by core consensus sequence and putative secondary modification, from 12 to 24. Bioinformatic analyses of the ATP-grasp ligase superfamily suggest that extant graspetide synthetases diverged once from an ancestral ATP-grasp ligase and later evolved to introduce a variety of ring connectivities. Furthermore, we characterized thatisin and iso-thatisin, two graspetides from Lysobacter antibioticus that are related by conformational stereoisomerism. Derived from a newly identified graspetide group, thatisin and iso-thatisin feature two interlocking macrolactones with identical ring connectivity, as determined by a combination of tandem mass spectrometry (MS/MS), methanolytic, and mutational analyses. NMR spectroscopy of thatisin revealed a cis conformation for a key proline residue, while molecular dynamics simulations, solvent-accessible surface area calculations, and partial methanolytic analysis coupled with MS/MS support a trans conformation for iso-thatisin at the same position. Overall, this work provides a comprehensive overview of the graspetide landscape, and the improved RODEO algorithm will accelerate future graspetide discoveries by enabling open-access analysis of existing and emerging genomes.
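
The donor/acceptor rule stated in the abstract (serine, threonine, or lysine donors linked to aspartate or glutamate acceptors) can be illustrated with a minimal Python sketch. This is not the RODEO algorithm and makes no claim about which linkages an enzyme actually forms; it only enumerates every donor/acceptor pairing that the rule permits in a one-letter amino acid string. The function name `candidate_linkages` and the toy input are hypothetical.

```python
# Hypothetical illustration only: lists every donor/acceptor pairing allowed
# by the rule in the abstract. It does not predict real ring connectivity.

# Ser/Thr hydroxyls would give esters (macrolactones);
# the Lys amine would give an amide (macrolactam).
DONORS = {"S": "macrolactone", "T": "macrolactone", "K": "macrolactam"}
ACCEPTORS = {"D", "E"}  # Asp/Glu side-chain carboxylates


def candidate_linkages(core):
    """Return (donor_pos, donor, acceptor_pos, acceptor, ring_type) tuples.

    Positions are 1-indexed within the supplied core sequence.
    """
    core = core.upper()
    pairs = []
    for i, donor in enumerate(core, start=1):
        if donor not in DONORS:
            continue
        for j, acceptor in enumerate(core, start=1):
            if acceptor in ACCEPTORS:
                pairs.append((i, donor, j, acceptor, DONORS[donor]))
    return pairs


if __name__ == "__main__":
    # "TAKE" is a made-up toy string, not a real graspetide core.
    for linkage in candidate_linkages("TAKE"):
        print(linkage)
```

In practice, only a small subset of these pairings is formed, and the actual connectivity has to be established experimentally, as the abstract describes for thatisin and iso-thatisin.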

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Products / chemistry*
  • Computational Biology / methods*
  • Ligases / chemistry*
  • Molecular Conformation
  • Multigene Family
  • Peptides / chemistry*
  • Protein Processing, Post-Translational
  • Ribosomes
  • Serine Proteinase Inhibitors / chemistry*
  • Tandem Mass Spectrometry

Substances

  • Biological Products
  • Peptides
  • Serine Proteinase Inhibitors
  • Ligases