Abundance and distribution of the highly iterated palindrome 1 (HIP1) among prokaryotes

Mob Genet Elements. 2011 Sep;1(3):159-168. doi: 10.4161/mge.1.3.18300. Epub 2011 Sep 1.

Abstract

We have studied the abundance and phylogenetic distribution of the Highly Iterated Palindrome 1 (HIP1) among sequenced prokaryotic genomes. We show that overrepresentation of HIP1 is exclusive to some lineages of cyanobacteria, and that this abundance was gained only once during evolution and was subsequently lost in the lineage leading to marine pico-cyanobacteria. We show that among cyanobacterial protein sequences with annotated Pfam domains, only OpcA (glucose 6-phosphate dehydrogenase assembly protein) has a phylogenetic distribution fully matching HIP1 abundance, suggesting a functional relationship. We also show that DAM methylase (an enzyme that has the four central nucleotides of HIP1 as its site of action) is present in all cyanobacterial genomes, independently of their HIP1 content, with the exception of marine pico-cyanobacteria, which might have lost this enzyme during the process of genome reduction. Our analyses also show that in some prokaryotic lineages (particularly in species with large genomes), HIP1 is unevenly distributed between coding and non-coding DNA: it is more common in coding regions, except in Cyanobacteria Yellowstone B' and Synechococcus elongatus, where the reverse is true. Finally, we explore the hypothesis that HIP1 can be used as a molecular "watermark" to identify genes horizontally transferred from cyanobacteria to other species.
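As an illustration only, the notion of HIP1 "overrepresentation" can be sketched as an observed-to-expected ratio. The sketch below takes HIP1 to be the octamer 5'-GCGATCGC-3' (its four central nucleotides, GATC, are the DAM methylase site mentioned above). The null model, in which bases occur independently at the sequence's own frequencies, and the function names `count_motif`, `expected_count`, and `overrepresentation` are assumptions made for this example; they are not necessarily the statistics used in the study.

```python
from collections import Counter

# HIP1 octamer; it equals its own reverse complement, so scanning one strand is enough.
HIP1 = "GCGATCGC"


def count_motif(seq: str, motif: str = HIP1) -> int:
    """Count occurrences of motif in seq, including overlapping ones."""
    seq = seq.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Advance by one base so overlapping copies are also found.
        start = seq.find(motif, start + 1)
    return count


def expected_count(seq: str, motif: str = HIP1) -> float:
    """Expected occurrences under an independent-base model (an assumed null model)."""
    seq = seq.upper()
    n = len(seq)
    if n < len(motif):
        return 0.0
    freqs = Counter(seq)
    # Probability of the motif at one position: product of its base frequencies.
    p = 1.0
    for base in motif:
        p *= freqs[base] / n
    # Multiply by the number of positions where the motif could start.
    return p * (n - len(motif) + 1)


def overrepresentation(seq: str, motif: str = HIP1) -> float:
    """Observed/expected ratio; values well above 1 suggest enrichment."""
    exp = expected_count(seq, motif)
    return count_motif(seq, motif) / exp if exp > 0 else float("nan")
```

Applied separately to the coding and non-coding portions of a genome, the same ratio would give a rough view of the uneven distribution described above. A composition-only null model is deliberately simple; a Markov model that accounts for di- or trinucleotide frequencies would be a stricter alternative.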