A new algorithm for cluster analysis of genomic methylation: the Helicobacter pylori case

Bioinformatics. 2008 Feb 1;24(3):383-8. doi: 10.1093/bioinformatics/btm621. Epub 2007 Dec 16.

Abstract

Motivation: Genomic methylation analysis is useful for typing bacteria that express a high number of type II methyltransferases. Methyltransferases usually belong to Restriction and Modification (R-M) systems, in which the restriction endonuclease imposes high pressure on the expression of the cognate methyltransferase, which hinders R-M system loss. Conventional clustering methods do not reflect this tendency. An algorithm was developed for dendrogram construction that reflects the propensity for conservation of type II R-M systems.

Results: The new algorithm was applied to 52 Helicobacter pylori strains from different geographical regions and compared with conventional clustering methods. The algorithm works by first grouping strains that share a common minimum set of R-M systems and then gradually adding strains according to the number of R-M systems acquired. Dendrograms revealed a cluster of African strains, which suggests that R-M systems have been present in the H. pylori genome since its human host migrated from Africa.
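
The grouping idea described in the Results can be illustrated with a minimal Python sketch. This is not the authors' published implementation. It assumes that the "common minimum set" is the set of R-M systems present in every strain, and that strains are then layered by how many additional systems they carry. The function name `layered_grouping`, the strain names, and the R-M labels are all hypothetical.

```python
from collections import defaultdict


def layered_grouping(profiles):
    """Layer strains by the number of R-M systems they carry beyond a shared core.

    profiles maps a strain name to the set of R-M systems detected in it.
    Assumption: the core is the intersection of all profiles.
    """
    # R-M systems shared by every strain (assumed reading of "common minimum set").
    core = set.intersection(*(set(s) for s in profiles.values()))

    # Bucket strains by the count of systems acquired on top of the core.
    layers = defaultdict(list)
    for strain, systems in profiles.items():
        acquired = set(systems) - core
        layers[len(acquired)].append(strain)

    # Strains with the fewest acquired systems come first.
    return core, [(n, sorted(layers[n])) for n in sorted(layers)]


# Hypothetical presence data, for illustration only.
profiles = {
    "strain_A": {"RM1", "RM2"},
    "strain_B": {"RM1", "RM2", "RM3"},
    "strain_C": {"RM1", "RM2", "RM3", "RM4"},
    "strain_D": {"RM1", "RM2", "RM5"},
}

core, layers = layered_grouping(profiles)
for n_acquired, strains in layers:
    print(n_acquired, strains)
```

A full dendrogram would also need a rule for joining strains within and between layers, which the abstract does not specify, so the sketch stops at the layering step.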

Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Base Sequence
  • Chromosome Mapping / methods*
  • Cluster Analysis*
  • DNA Methylation
  • DNA Restriction-Modification Enzymes / genetics*
  • DNA, Bacterial / genetics*
  • Helicobacter pylori / enzymology*
  • Helicobacter pylori / genetics*
  • Molecular Sequence Data

Substances

  • DNA Restriction-Modification Enzymes
  • DNA, Bacterial