Simple sequence repeats in Haemophilus influenzae

Infect Genet Evol. 2009 Mar;9(2):216-28. doi: 10.1016/j.meegid.2008.11.006. Epub 2008 Nov 28.

Abstract

Simple sequence repeats (SSRs) of DNA are subject to high rates of mutation and are important mediators of adaptation in Haemophilus influenzae. Previous studies of the Rd KW20 genome identified the primacy of tetranucleotide SSRs in mediating phase variation (the rapid, reversible switching of gene expression) of surface-exposed structures such as lipopolysaccharide. The recent sequencing of the genomes of multiple strains of H. influenzae allowed a detailed comparison of the SSRs (repeat units of one to nine nucleotides in length) across four complete H. influenzae genomes, followed by comparison with a further 12 genomes as they became available. The SSR loci were broadly classified into three groups: (1) those that did not vary; (2) those for which some variation between strains was observed but could not be linked to variation of gene expression; and (3) those that both varied and were located in regions consistent with mediating phase-variable gene expression. Comparative analysis of 988 SSR-associated loci confirmed that tetranucleotide repeats were the major mediators of phase variation. It also extended the repertoire of known tetranucleotide SSR loci by identifying ten previously uncharacterised tetranucleotide SSR loci with the potential to mediate phase variation; these loci were unequally distributed across the H. influenzae pan-genome. Further, analysis of non-tetranucleotide SSRs in the 16 strains revealed a number of mononucleotide, dinucleotide, pentanucleotide, heptanucleotide, and octanucleotide SSRs whose features were consistent with a role in mediating phase variation. This study substantiates previous findings on the important role that tetranucleotide SSRs play in H. influenzae biology. Two Brazilian isolates showed the most variation in their complement of SSRs, suggesting the possibility of geographic and phenotypic influences on SSR distribution.
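
As an illustration only, the kind of scan and grouping described in the abstract might be sketched as follows. This is not the authors' pipeline: the function names, the minimum copy-number threshold, and the boolean flag standing in for "located in a region consistent with phase-variable expression" are all assumptions introduced here, since the abstract does not state the criteria actually used.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=9, min_copies=3):
    """Return sorted (start, unit, copies) tuples for tandem repeat tracts.

    min_copies is a hypothetical threshold; the abstract gives no cutoff.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit of unit_len bases followed by at least
        # (min_copies - 1) further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are just a shorter unit repeated
            # (e.g. "CAATCAAT"), which the shorter pass already reports.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)

def classify_locus(copies_by_strain, in_expression_region):
    """Assign a locus to group 1, 2, or 3 as outlined in the abstract.

    in_expression_region is a hypothetical flag supplied by the caller.
    """
    if len(set(copies_by_strain.values())) == 1:
        return 1  # no variation between strains
    return 3 if in_expression_region else 2

# Example with made-up sequence and copy numbers.
print(find_ssrs("GGC" + "CAAT" * 5 + "GCC"))
print(classify_locus({"strain_A": 5, "strain_B": 7}, True))
```

Because the regular expression reports the leftmost match, a tract may be reported under a rotated unit (for example TCAA rather than CAAT) when the flanking base happens to extend the repeat; a fuller analysis would normalise such rotations.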

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Bacterial / genetics*
  • DNA, Bacterial / metabolism
  • Gene Expression Regulation, Bacterial
  • Genetic Variation
  • Genome, Bacterial / genetics*
  • Haemophilus influenzae / genetics*
  • Repetitive Sequences, Nucleic Acid / genetics*

Substances

  • DNA, Bacterial