Growth by Insertion: The Family of Bacterial DDxP Proteins

Int J Mol Sci. 2020 Dec 2;21(23):9184. doi: 10.3390/ijms21239184.

Abstract

We have identified a variety of proteins in species of the Legionella, Aeromonas, Pseudomonas, Vibrio, Nitrosomonas, Nitrosospira, Variovorax, Halomonas, and Rhizobia genera that feature repetitive modules of different lengths and compositions, invariably ending at the COOH side with Asp-Asp-x-Pro (DDxP) motifs. DDxP proteins range in size from 900 to 6200 amino acids (aa), and contain 1 to 5 different module types, present in one or multiple copies. We hypothesize that DDxP proteins were modeled by the action of specific endonucleases inserting DNA segments into genes encoding DDxP motifs. Target site duplications (TSDs) formed upon repair of staggered ends generated by endonuclease cleavage would explain the DDxP motifs at repeat ends. TSDs eventually acted as targets for the insertion of more modules of the same or different types. Repeat clusters plausibly resulted from amplification of both repeats and flanking TSDs. The proposed growth-by-insertion model is supported by the identification of homologous proteins lacking repeats in Pseudomonas and Rhizobium. The 85 DDxP repeats identified in this work vary in length, and can be sorted into short (136-215 aa) and long (243-304 aa) types. Conserved Asp-Gly-Asp-Gly-Asp motifs are located 11-19 aa from the terminal DDxP motifs in all repeats, and far upstream in most long repeats.
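The motif arrangement described above can be illustrated with a minimal Python sketch. It is not the authors' method. It assumes that "x" in DDxP is any single residue in one-letter code, and that the 11-19 aa spacing is counted as the residues between the end of Asp-Gly-Asp-Gly-Asp (DGDGD) and the start of DDxP; the abstract does not define how the distance is measured. The function names and the toy sequence are hypothetical.

```python
import re

# Assumption: "x" in DDxP is any single amino acid (one-letter code).
DDXP = re.compile(r"DD.P")
DGDGD = "DGDGD"


def find_ddxp(seq):
    """Return 0-based start positions of DDxP matches (non-overlapping scan)."""
    return [m.start() for m in DDXP.finditer(seq)]


def dgdgd_gaps(seq):
    """Map each DDxP start to the number of residues separating it from the
    nearest upstream DGDGD (None if no DGDGD lies fully upstream).

    Assumption: the gap is counted from the end of DGDGD to the start of DDxP.
    """
    gaps = {}
    for pos in find_ddxp(seq):
        # rfind restricted to seq[0:pos] finds the last DGDGD fully upstream.
        up = seq.rfind(DGDGD, 0, pos)
        gaps[pos] = None if up == -1 else pos - (up + len(DGDGD))
    return gaps


# Hypothetical toy sequence, not taken from any real DDxP protein.
toy = "MK" + "DGDGD" + "A" * 12 + "DDTP" + "GS"
print(find_ddxp(toy))
print(dgdgd_gaps(toy))
```

In the toy sequence the gap is 12 residues, which falls inside the 11-19 aa window reported for the repeats; on real sequences the same check would only flag candidate repeat ends, not confirm them.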

Keywords: Asp-rich motifs; Ca2+-binding sites; HGT; RTX toxins; bacterial adhesins; horizontal gene transfer; modular proteins; site-specific endonucleases; target site duplications; type I secretion systems.

MeSH terms

  • Amino Acid Motifs*
  • Amino Acid Sequence
  • Bacterial Physiological Phenomena*
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism*
  • Base Sequence
  • Calcium / metabolism
  • Gene Transfer, Horizontal
  • Multigene Family
  • Phylogeny
  • Protein Domains*
  • Repetitive Sequences, Nucleic Acid
  • Species Specificity
  • Type I Secretion Systems / genetics
  • Type I Secretion Systems / metabolism

Substances

  • Bacterial Proteins
  • Type I Secretion Systems
  • Calcium