Genome wide classification and characterisation of CpG sites in cancer and normal cells

Comput Biol Med. 2016 Jan 1:68:57-66. doi: 10.1016/j.compbiomed.2015.09.023. Epub 2015 Oct 23.

Abstract

This study identifies methylation patterns shared across different cancer types, in an effort to identify common molecular events in diverse types of cancer cells, and provides evidence that the sequence surrounding a CpG influences its susceptibility to aberrant methylation. Using all the freely available microarray data, CpG sites throughout the genome were divided into four classes: sites that become hypo- or hyper-methylated in a variety of cancers (HypoCancer and HyperCancer classes) and sites found in a constant hypo-methylated (Never methylated class) or hyper-methylated (Always methylated class) state in both normal and cancer cells. Our data show that most CpG sites included in the HumanMethylation450K microarray remain unmethylated in normal and cancerous cells; however, certain sites become specifically modified in all the cancers investigated. More detailed analysis of the sites revealed that the majority of those in the Never methylated class were in CpG islands, whereas those in the HyperCancer class were mostly associated with miRNA coding regions. The sites in the Hypermethylated class are associated with genes involved in initiating or maintaining the cancerous state, being enriched for processes involved in apoptosis, and with transcription factors predicted to bind to these genes that are linked to apoptosis and tumourigenesis (notably including E2F). Further, we show that more LINE elements are associated with the HypoCancer class and more Alu repeats are associated with the HyperCancer class. Motifs were identified that distinguish the classes based on the surrounding DNA sequence alone and that point to DNA sequences that could render sites more prone to aberrant methylation in cancer cells. This provides evidence that the sequence surrounding a CpG site has an influence on whether a site is hypo- or hyper-methylated.
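The four-class scheme described above can be illustrated with a minimal sketch. The abstract does not state the cut-offs or the statistical procedure the authors used, so everything here is an assumption made for illustration: the function name `classify_site`, the use of mean beta values (methylation fraction between 0 and 1) per site, and the thresholds `low`, `high` and `min_shift` are all hypothetical and are not taken from the paper.

```python
def classify_site(normal_beta, cancer_beta, low=0.2, high=0.8, min_shift=0.2):
    """Assign one CpG site to a class from its mean beta values.

    normal_beta / cancer_beta: mean methylation fraction (0..1) of the site
    in normal and cancer samples. The thresholds are illustrative
    assumptions, not values reported in the abstract.
    """
    # Constantly unmethylated in both normal and cancer cells.
    if normal_beta <= low and cancer_beta <= low:
        return "Never methylated"
    # Constantly methylated in both normal and cancer cells.
    if normal_beta >= high and cancer_beta >= high:
        return "Always methylated"
    shift = cancer_beta - normal_beta
    # Gains methylation in cancer relative to normal.
    if shift >= min_shift:
        return "HyperCancer"
    # Loses methylation in cancer relative to normal.
    if shift <= -min_shift:
        return "HypoCancer"
    # Sites with no clear pattern are left out of the four classes.
    return "Unclassified"


def classify_sites(betas):
    """Classify many sites; `betas` maps site ID -> (normal_beta, cancer_beta)."""
    return {site: classify_site(n, c) for site, (n, c) in betas.items()}


if __name__ == "__main__":
    # Made-up site IDs and values, for demonstration only.
    example = {
        "site_A": (0.05, 0.10),
        "site_B": (0.90, 0.95),
        "site_C": (0.10, 0.70),
        "site_D": (0.85, 0.30),
    }
    for site, label in classify_sites(example).items():
        print(site, label)
```

A real analysis would aggregate many samples per cancer type and would likely apply a statistical test rather than fixed thresholds; the sketch only shows how the four class labels relate to one another.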

Keywords: Computational analysis; CpG; DNA sequence; Methylation in cancer; Motif; Pattern identification; Pattern searching algorithm.

MeSH terms

  • Alu Elements*
  • Animals
  • CpG Islands*
  • DNA Methylation
  • DNA, Neoplasm / genetics*
  • Genome-Wide Association Study / methods*
  • Humans
  • MicroRNAs / genetics
  • Neoplasm Proteins / genetics*
  • Neoplasms / genetics*
  • RNA, Neoplasm / genetics*

Substances

  • DNA, Neoplasm
  • MicroRNAs
  • Neoplasm Proteins
  • RNA, Neoplasm