Array probe density and pathobiological relevant CpG calling bias in human disease and physiological DNA methylation profiling

Brief Funct Genomics. 2018 Jan 1;17(1):42-48. doi: 10.1093/bfgp/elx017.

Abstract

The HumanMethylation450 BeadChip array (450K; Infinium) is a widely used tool in epigenomics. A recognized concern with the 450K platform is the potential effect of the number of probes per gene (PG) on the ranking of differentially methylated (DM) CpGs (DM-CpGs) before testing for enrichment of gene ontology categories. We previously showed in a fatty acid (FA)-induced DNA methylation profiling study that when genes are ranked by the ratio of called DM-CpGs to PG, the list of the 150 top-ranking genes is enriched in pathways that overlap with the corresponding Affymetrix array-based expression data. In this study, a comparative analysis of thirteen 450K-based studies representing FA-stimulated cellular models, aging, and diseased and normal tissues revealed that the 150 top-ranking DM-CpGs are in high-PG genes. This points to a significant false-negative rate in the low-PG gene set when delta-beta-based ranking is performed. We show that PG is not related to the density of methylation-prone sites, as it does not follow gene length or GC content. Conversely, ranking genes by the DM-CpGs-to-PG ratio and analysing the 150 top-ranking entries yields significantly enriched disease- or tissue-specific gene function categories that are increased both in number and in the degree of overlap with expression data, compared with delta-beta-only ranking or with the previously published gometh-based pipeline. The list of the 15 top-ranking loci is also significantly enriched in non-coding RNAs, a transcript type that is greatly underrepresented in 450K. In summary, the proposed simple normalization method yields pathobiologically relevant DM-CpGs. This method is relevant for the newly developed MethylationEPIC (Infinium) microarray.
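The ratio-based ranking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `rank_genes_by_dm_to_pg`, the input layout (one gene symbol per called DM-CpG, plus a gene-to-probe-count mapping), the alphabetical tie-breaking, and the toy gene names are all assumptions made for illustration. Only the idea of dividing the number of called DM-CpGs by PG and keeping the top-ranking entries (150 in the abstract) comes from the text.

```python
from collections import Counter


def rank_genes_by_dm_to_pg(dm_cpg_genes, probes_per_gene, top_n=150):
    """Rank genes by (number of called DM-CpGs) / (number of array probes, PG).

    dm_cpg_genes: iterable with one gene symbol per called DM-CpG (hypothetical layout).
    probes_per_gene: dict mapping gene symbol -> number of array probes for that gene.
    """
    dm_counts = Counter(dm_cpg_genes)
    ratios = []
    for gene, n_dm in dm_counts.items():
        pg = probes_per_gene.get(gene, 0)
        if pg <= 0:
            # Assumption: genes without a probe count cannot be normalized, so skip them.
            continue
        ratios.append((gene, n_dm / pg))
    # Highest ratio first; ties broken alphabetically (an arbitrary, illustrative choice).
    ratios.sort(key=lambda item: (-item[1], item[0]))
    return ratios[:top_n]


# Toy data with made-up gene names: GENE_A has many probes, GENE_B has few.
probes_per_gene = {"GENE_A": 40, "GENE_B": 4, "GENE_C": 10}
dm_cpg_genes = ["GENE_A"] * 8 + ["GENE_B"] * 2 + ["GENE_C"] * 1

ranked = rank_genes_by_dm_to_pg(dm_cpg_genes, probes_per_gene, top_n=2)
for gene, ratio in ranked:
    print(gene, ratio)
```

In this toy example GENE_A has the most called DM-CpGs (8), but GENE_B ranks first because 2 of its 4 probes are called (ratio 0.5, against 8/40 = 0.2 for GENE_A). This is the kind of low-PG gene that the abstract argues is missed when probe density is not taken into account.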

Keywords: 450K Illumina DNA methylation arrays; 850K (EPIC) Illumina DNA methylation arrays; DNA methylation; epigenomics; microarray.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • CpG Islands / genetics*
  • DNA Methylation / genetics*
  • DNA Probes / metabolism
  • Disease / genetics*
  • Genetic Loci
  • Humans
  • Oligonucleotide Array Sequence Analysis / methods*

Substances

  • DNA Probes