Streptococcus thermophilus core genome: comparative genome hybridization study of 47 strains

Appl Environ Microbiol. 2008 Aug;74(15):4703-10. doi: 10.1128/AEM.00132-08. Epub 2008 Jun 6.

Abstract

A DNA microarray platform based on 2,200 genes from publicly available sequences was designed for Streptococcus thermophilus. We determined how single-nucleotide polymorphisms in the 65- to 75-mer oligonucleotide probe sequences affect the hybridization signals. The microarrays were then used for comparative genome hybridization (CGH) of 47 dairy S. thermophilus strains. An analysis of the exopolysaccharide genes in each strain confirmed previous findings that this class of genes is highly variable. A phylogenetic tree based on the CGH data showed similar distances for most strains, indicating frequent recombination or gene transfer within S. thermophilus. By comparing genome sizes estimated from the microarrays with those estimated from pulsed-field gel electrophoresis, the amount of unknown DNA in each strain was estimated. A core genome comprising 1,271 genes detected in all 47 strains was identified. Likewise, a set of noncore genes detected in only some strains was identified. The concept of an industrial core genome is proposed. It consists of the genes in the core genome plus genes that are necessary in an applied industrial context.
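
The core/noncore split described in the abstract amounts to a set intersection over per-strain detection calls. The following is a minimal Python sketch, assuming each strain's detected genes are already available as a set. The strain names, gene identifiers, and both helper functions are hypothetical and are not taken from the study. The unknown-DNA estimate is written as a plain difference between the PFGE size and the array-derived size, which is an assumption about how such a comparison could be made, not the authors' stated procedure.

```python
from typing import Dict, Set, Tuple


def split_core_noncore(detected: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, noncore) gene sets from per-strain detection calls.

    core    = genes detected in every strain
    noncore = genes detected in at least one strain but not in all
    """
    if not detected:
        return set(), set()
    gene_sets = list(detected.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan - core


def unknown_dna_kb(pfge_size_kb: float, array_size_kb: float) -> float:
    """Assumed estimate: DNA not represented on the array, in kb."""
    return pfge_size_kb - array_size_kb


# Toy, made-up detection calls (not data from the study).
detected = {
    "strain_A": {"gene1", "gene2", "gene3"},
    "strain_B": {"gene1", "gene2", "gene4"},
    "strain_C": {"gene1", "gene2"},
}
core, noncore = split_core_noncore(detected)
```

An "industrial core genome" in the sense proposed above could then be expressed as the union of `core` with a separately curated set of application-relevant genes.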

MeSH terms

  • DNA, Bacterial / genetics*
  • Genome, Bacterial*
  • Nucleic Acid Hybridization*
  • Oligonucleotide Array Sequence Analysis*
  • Phylogeny
  • RNA, Bacterial / genetics
  • RNA, Ribosomal, 16S / genetics
  • RNA, Ribosomal, 23S / genetics
  • Streptococcus thermophilus / classification
  • Streptococcus thermophilus / genetics*

Substances

  • DNA, Bacterial
  • RNA, Bacterial
  • RNA, Ribosomal, 16S
  • RNA, Ribosomal, 23S

Associated data

  • GEO/GPL6369