DNA Methylation Markers for Pan-Cancer Prediction by Deep Learning

Genes (Basel). 2019 Oct 4;10(10):778. doi: 10.3390/genes10100778.

Abstract

Many DNA methylation markers have been identified for cancer diagnosis. However, few studies have sought DNA methylation markers that can diagnose diverse cancer types simultaneously, i.e., pan-cancer markers. In this study, we sought DNA methylation markers that differentiate cancer samples from the respective normal samples across multiple cancer types. We collected whole-genome methylation data for 27 cancer types, comprising 10,140 cancer samples and 3,386 normal samples, and divided all samples into five data sets: one training data set, one validation data set, and three test data sets. We applied machine learning to identify DNA methylation markers and, specifically, constructed diagnostic prediction models by deep learning. We identified two categories of markers: 12 CpG markers and 13 promoter markers. Three of the 12 CpG markers and four of the 13 promoter markers are located in cancer-related genes. With the CpG markers, our model achieved an average sensitivity and specificity on the test data sets of 92.8% and 90.1%, respectively. With the promoter markers, the average sensitivity and specificity on the test data sets were 89.8% and 81.1%, respectively. Furthermore, in cell-free DNA methylation data from 163 prostate cancer samples, the CpG markers achieved a sensitivity of 100% and the promoter markers a sensitivity of 92%. For both marker types, the specificity on normal whole blood was 100%. In conclusion, we identified methylation markers for pan-cancer diagnosis, which might be applied to liquid biopsy of cancers.
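
For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how sensitivity and specificity can be computed for each test data set and then averaged. It is not the authors' code. The function names, the toy labels, and the use of an unweighted mean across test sets are assumptions made for illustration; labels are coded 1 for cancer and 0 for normal.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = cancer, 0 = normal)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1:
            if pred == 1:
                tp += 1  # cancer sample correctly called cancer
            else:
                fn += 1  # cancer sample missed
        else:
            if pred == 0:
                tn += 1  # normal sample correctly called normal
            else:
                fp += 1  # normal sample called cancer
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each data set needs at least one cancer and one normal sample")
    return tp / (tp + fn), tn / (tn + fp)


def average_over_test_sets(
    test_sets: List[Tuple[Sequence[int], Sequence[int]]]
) -> Tuple[float, float]:
    """Unweighted mean of per-set sensitivity and specificity (an assumption)."""
    pairs = [sensitivity_specificity(y_true, y_pred) for y_true, y_pred in test_sets]
    mean_sens = sum(s for s, _ in pairs) / len(pairs)
    mean_spec = sum(c for _, c in pairs) / len(pairs)
    return mean_sens, mean_spec


# Hypothetical toy labels for three test sets; these are NOT data from the study.
toy_test_sets = [
    ([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 1]),
    ([1, 1, 0, 0], [1, 0, 0, 0]),
]

if __name__ == "__main__":
    sens, spec = average_over_test_sets(toy_test_sets)
    print(f"average sensitivity = {sens:.3f}, average specificity = {spec:.3f}")
```

Sensitivity is the fraction of cancer samples predicted as cancer, and specificity is the fraction of normal samples predicted as normal. A sample-weighted average across test sets would be an equally plausible reading of "average" and can give different values when the test sets differ in size.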

Keywords: biomarker, methylation, pan-cancer, deep learning, CpG, promoter.

MeSH terms

  • Biomarkers, Tumor / genetics*
  • CpG Islands / genetics
  • DNA Methylation / genetics
  • Deep Learning
  • Epigenesis, Genetic / genetics
  • Forecasting
  • Genetic Markers
  • Genetic Testing / methods
  • Humans
  • Machine Learning
  • Neoplasms / classification*
  • Neoplasms / genetics*
  • Promoter Regions, Genetic
  • Sensitivity and Specificity

Substances

  • Biomarkers, Tumor
  • Genetic Markers