Revealing Alzheimer's disease genes spectrum in the whole-genome by machine learning

BMC Neurol. 2018 Jan 10;18(1):5. doi: 10.1186/s12883-017-1010-3.

Abstract

Background: Alzheimer's disease (AD) is an important, progressive neurodegenerative disease with a complex genetic architecture. A key goal of biomedical research is to identify disease risk genes and to elucidate their function in the development of disease. For this purpose, expanding the AD-associated gene set is necessary. In past research, prediction methods for AD-related genes have been limited in the genome regions they explored. Here we present a genome-wide method for predicting AD candidate genes.

Methods: We present a machine learning approach based on a support vector machine (SVM) that integrates gene expression data with human brain-specific gene network data to discover the full spectrum of AD genes across the whole genome.
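As a rough illustration of the kind of pipeline described in the Methods, the sketch below concatenates per-gene expression features with per-gene network features and fits a linear SVM by sub-gradient descent on the hinge loss. The abstract does not specify the kernel, the feature definitions, or the training procedure, so the linear kernel, the hyperparameters, the gene names (`GENE_A` to `GENE_F`), and all feature values here are hypothetical assumptions, not the authors' implementation.

```python
# Minimal sketch: integrate two feature sources per gene, then train a linear SVM.
# All gene names and feature values are made up (assumed standardized scores).

def integrate_features(expression, network):
    """Concatenate expression and network features for genes present in both."""
    return {g: expression[g] + network[g] for g in expression if g in network}

def decision_score(w, b, x):
    """Signed score w.x + b; positive means 'predicted AD-associated'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=200):
    """Full-batch sub-gradient descent on the L2-regularized hinge loss.

    Labels must be +1 (known AD gene) or -1 (non-AD gene).
    """
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(steps):
        # Samples inside the margin contribute to the hinge-loss sub-gradient.
        violators = [i for i in range(n) if y[i] * decision_score(w, b, X[i]) < 1]
        w = [(1 - lr * lam) * wj for wj in w]  # shrinkage from the L2 penalty
        for i in violators:
            w = [wj + lr * y[i] * xj / n for wj, xj in zip(w, X[i])]
            b += lr * y[i] / n
    return w, b

# Hypothetical inputs: two expression features and one network feature per gene.
expression = {
    "GENE_A": [2.0, 1.0], "GENE_B": [1.5, 1.5],
    "GENE_C": [-1.0, -2.0], "GENE_D": [-1.5, -1.0],
    "GENE_E": [1.8, 1.2], "GENE_F": [-1.6, -1.4],
}
network = {
    "GENE_A": [1.0], "GENE_B": [0.8], "GENE_C": [-0.9],
    "GENE_D": [-1.1], "GENE_E": [0.9], "GENE_F": [-1.0],
}
labels = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1, "GENE_D": -1}

features = integrate_features(expression, network)
train_genes = sorted(labels)
w, b = train_linear_svm([features[g] for g in train_genes],
                        [labels[g] for g in train_genes])

# Score the unlabeled genes; a positive score marks a candidate AD gene.
candidates = {g: decision_score(w, b, features[g])
              for g in features if g not in labels}
```

In a genome-wide setting the same scoring step would run over every unlabeled gene, and the ranked scores would form the candidate list.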

Results: We classified AD candidate genes with an accuracy of 84.56% and an area under the receiver operating characteristic (ROC) curve of 94%. Our approach supplements the spectrum of AD-associated genes with candidates extracted from more than 20,000 genes on a genome-wide scale.
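For readers unfamiliar with the two metrics reported above, the following sketch computes accuracy and the area under the ROC curve from their standard definitions. The abstract does not say how the reported values were estimated (for example, whether cross-validation was used), and the labels, scores, and 0.5 threshold below are invented purely for illustration.

```python
# Minimal sketch of the two reported metrics, on made-up labels and scores.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = AD gene, 0 = non-AD gene.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc = accuracy(y_true, y_pred)  # 4 of 6 correct
auc = roc_auc(y_true, scores)   # 8 of 9 positive-negative pairs ranked correctly
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures can differ as they do in the Results.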

Conclusions: In this study, we have elucidated the whole-genome spectrum of AD using a machine learning approach. We expect the resulting candidate gene catalogue to provide researchers with a more comprehensive annotation of AD.

Keywords: Alzheimer’s disease; Gene; Machine learning.

MeSH terms

  • Alzheimer Disease / genetics*
  • Area Under Curve
  • Genome-Wide Association Study / methods*
  • Humans
  • Machine Learning*
  • ROC Curve
  • Sensitivity and Specificity