Factors to preserve CpG-rich sequences in methylated CpG islands

BMC Genomics. 2015 Feb 28;16(1):144. doi: 10.1186/s12864-015-1286-x.

Abstract

Background: Mammalian CpG islands (CGIs) normally escape DNA methylation in all adult tissues and developmental stages. However, in our previous study we unexpectedly identified many methylated CGIs in human peripheral blood leukocytes. Methylated CpG dinucleotides convert to TpG dinucleotides through deamination of their cytosine bases more frequently than hypomethylated CpG dinucleotides do. We therefore wondered how methylated CGIs in germline or non-germline cells maintain their CpG-rich sequences. Germline hypomethylation, CpG selection, biased gene conversion (BGC), and frequent CpG fixation are all known to contribute to the maintenance of CpG-rich sequences in methylated CGIs in germline or non-germline cells. However, which of these processes maintains the CpG-rich sequences of methylated CGIs at each genomic position has not been investigated.

Results: In this study, we comprehensively examined the contribution of the processes described above to the maintenance of CpG-rich sequences in methylated CGIs in germline and non-germline cells, with the CGIs classified by genomic position. At all genomic positions, approximately 60-80% of CGIs with high methylation in the H1 cell line (H1-HM) showed a low average CpG→TpG/CpA substitution rate. In contrast, at all genomic positions, fewer than half of the CGIs with H1-HM showed both a low average CpG→TpG/CpA substitution rate and low levels of methylation in sperm cells (SPM-LM). Furthermore, a small fraction of CGIs with a low average CpG→TpG/CpA substitution rate and high levels of methylation in sperm cells (SPM-HM) showed CpG selection. On the other hand, independent of their positions in genes, most CGIs with SPM-HM showed a slightly higher average TpG/CpA→CpG substitution rate than those with SPM-LM.
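
To make the two substitution rates concrete, the following is a minimal sketch of how a CpG→TpG/CpA rate and a TpG/CpA→CpG rate could be computed from a pair of gap-free, equal-length aligned sequences. It is an illustration under stated assumptions, not the method used in the study: the function names, the simple per-site definition of each rate, and the idea that one sequence represents an inferred ancestral state are all hypothetical.

```python
def cpg_loss_rate(ancestral: str, derived: str) -> float:
    """Fraction of ancestral CpG sites that appear as TpG or CpA in the derived sequence.

    Assumption: both sequences are aligned, equal in length, and 'ancestral'
    is an inferred ancestral state (hypothetical setup for illustration).
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned and equal in length")
    a, d = ancestral.upper(), derived.upper()
    cpg_sites = 0
    lost = 0
    for i in range(len(a) - 1):
        if a[i:i + 2] == "CG":
            cpg_sites += 1
            # TpG is C->T on this strand; CpA is C->T on the opposite strand.
            if d[i:i + 2] in ("TG", "CA"):
                lost += 1
    return lost / cpg_sites if cpg_sites else 0.0


def cpg_gain_rate(ancestral: str, derived: str) -> float:
    """Fraction of ancestral TpG/CpA sites that appear as CpG in the derived sequence."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned and equal in length")
    a, d = ancestral.upper(), derived.upper()
    source_sites = 0
    gained = 0
    for i in range(len(a) - 1):
        if a[i:i + 2] in ("TG", "CA"):
            source_sites += 1
            if d[i:i + 2] == "CG":
                gained += 1
    return gained / source_sites if source_sites else 0.0


if __name__ == "__main__":
    # Toy sequences, not real data.
    anc = "ACGTCGACGT"
    der = "ATGTCGACAT"
    print(cpg_loss_rate(anc, der))
    print(cpg_gain_rate(der, anc))
```

In this toy definition, a CGI with a low loss rate retains its CpG sites, while a higher gain rate reflects TpG/CpA sites becoming fixed as CpG. A real analysis would additionally need ancestral-state inference and handling of alignment gaps.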

Conclusions: A relatively large fraction (approximately 60-80%) of CGIs with H1-HM at all genomic positions preserve their CpG-rich sequences through a low CpG→TpG/CpA substitution rate, caused mainly by SPM-LM; those with SPM-HM do so partly through CpG selection and TpG/CpA→CpG fixation. BGC contributes little to the maintenance of CpG-rich sequences in CGIs with SPM-HM, regardless of genomic position.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Composition
  • Cell Line
  • CpG Islands / genetics*
  • DNA Methylation / genetics*
  • Databases, Genetic
  • Genome
  • Humans
  • Male
  • Pan troglodytes / genetics
  • Spermatozoa / metabolism