Dominant short repeated sequences in bacterial genomes

Genomics. 2015 Mar;105(3):175-81. doi: 10.1016/j.ygeno.2014.12.009. Epub 2015 Jan 3.

Abstract

We use a novel multidimensional searching approach to present the first exhaustive search for all possible repeated sequences in 166 genomes selected to cover the bacterial domain. We found an overrepresentation of repeated sequences in all but one of the genomes. The most prevalent repeats by far were related to clustered regularly interspaced short palindromic repeats (CRISPRs), which confer bacterial adaptive immunity. We identified a deep-branching clade of thermophilic Firmicutes containing the highest number of CRISPR repeats. We also identified a high prevalence of tandemly repeated heptamers. In addition, we identified GC-rich repeats that could potentially be involved in recombination events. Finally, we identified repeats in a 16,322-amino-acid megaprotein (involved in biofilm formation) and inverted repeats flanking miniature inverted-repeat transposable elements (MITEs). In conclusion, the exhaustive search for repeated sequences identified new elements and their distributions, which has implications for understanding both the ecology and evolution of bacteria.
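
To illustrate what a tandemly repeated heptamer looks like in sequence data, the following is a minimal Python sketch. It is not the multidimensional search used in the study, which the abstract does not describe in detail; it is a simple regular-expression scan, and the function name `tandem_heptamers` and the `min_copies` threshold are assumptions made for illustration only.

```python
import re

def tandem_heptamers(seq, min_copies=3):
    """Return (start, unit, copies) for each run of a 7-mer repeated back to back.

    Hypothetical helper for illustration; min_copies should be at least 2.
    """
    seq = seq.upper()
    # ([ACGT]{7}) captures one 7-base unit; \1{n,} then requires at least
    # n further adjacent copies of that same unit.
    pattern = re.compile(r"([ACGT]{7})\1{%d,}" % (min_copies - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // 7  # the match length is a multiple of 7
        hits.append((m.start(), unit, copies))
    return hits

# Toy sequence (made up, not real genome data): four copies of ACGTTGA.
example = "TTT" + "ACGTTGA" * 4 + "CCC"
print(tandem_heptamers(example))
```

Note that a scan like this also reports degenerate cases such as long homopolymer runs, and it finds only exact, non-overlapping copies; a real analysis would need to handle both strands and approximate repeats.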

Keywords: Bacteria; Repeated sequences.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / genetics*
  • Clustered Regularly Interspaced Short Palindromic Repeats
  • Evolution, Molecular
  • Genome, Bacterial*
  • Repetitive Sequences, Nucleic Acid*