Simple sequence repeat insertion induced stability and potential 'gain of function' in the proteins of extremophilic bacteria

Extremophiles. 2022 May 5;26(2):17. doi: 10.1007/s00792-022-01265-0.

Abstract

Here, we analysed genomic evolution in extremophilic bacteria using long simple sequence repeats (SSRs). The frequency of occurrence, relative abundance (RA) and relative density (RD) of long SSRs were determined in the genomes of extremophilic bacteria. Thermus aquaticus had the highest RA and RD of long SSRs in its coding sequences (110.6 and 1408.3, respectively), followed by Rhodoferax antarcticus (77.0 and 1187.4). G + C content correlated positively with both the RA and the RD of long SSRs. Geobacillus kaustophilus, Geobacillus thermoleovorans, Halothermothrix orenii, R. antarcticus, and T. aquaticus favoured trinucleotide repeats within their genomes, whereas the other species carried a higher number of tetranucleotide repeats. Gene enrichment analysis showed that these long SSRs occur in genes encoding metabolic enzymes related to stress tolerance. To analyse the functional implications of SSR insertions, three-dimensional structure modelling was carried out for the protein encoded by an SSR-containing diguanylate cyclase (DGC) gene. Removal of the SSR sequence led to improper folding and instability of the modelled protein structure.
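
The abstract reports RA and RD without defining them. The sketch below shows one common way such values are computed, assuming RA is the number of SSRs per Mb of sequence and RD is the total SSR length in bp per Mb of sequence. These definitions, the 12 bp threshold for a "long" SSR, the 1 to 6 bp motif range, and the function names are all assumptions made for illustration; they are not taken from the paper, which may have used a dedicated SSR-mining tool with different settings. Only perfect repeats are detected.

```python
import re


def is_primitive(motif):
    # "ATAT" is only "AT" repeated; a motif that reappears inside its own
    # doubled string (ends trimmed) is built from a shorter unit.
    return motif not in (motif + motif)[1:-1]


def find_long_ssrs(seq, max_unit=6, min_length=12):
    """Return sorted (start, motif, length) tuples for perfect tandem repeats.

    max_unit and min_length are illustrative defaults, not the paper's values.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            length = m.end() - m.start()
            if is_primitive(motif) and length >= min_length:
                hits.append((m.start(), motif, length))
    return sorted(hits)


def relative_abundance_and_density(seq, **kwargs):
    """Assumed definitions: RA = SSR count per Mb, RD = SSR bp per Mb."""
    if not seq:
        raise ValueError("sequence must not be empty")
    hits = find_long_ssrs(seq, **kwargs)
    total_bp = sum(length for _, _, length in hits)
    ra = len(hits) * 1e6 / len(seq)
    rd = total_bp * 1e6 / len(seq)
    return ra, rd


if __name__ == "__main__":
    # Toy 36 bp sequence with one trinucleotide and one dinucleotide repeat.
    toy = "T" + "CAG" * 5 + "TTACG" + "AT" * 7 + "C"
    print(find_long_ssrs(toy))
    print(relative_abundance_and_density(toy))
```

On a real genome the same functions would be applied to the concatenated coding sequences, and the resulting values depend strongly on the chosen length threshold, so they should not be expected to reproduce the figures quoted above.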

Keywords: Adaptation; Extremophiles; Functional genomics; Genome analysis; Simple sequence repeats.

MeSH terms

  • Bacteria / genetics
  • Base Composition
  • Extremophiles* / genetics
  • Gain of Function Mutation
  • Microsatellite Repeats