Prediction of Alzheimer's disease using blood gene expression data

Taesic Lee; Hyunju Lee

doi:10.1038/s41598-020-60595-1

Sci Rep. 2020 Feb 26;10(1):3485. doi: 10.1038/s41598-020-60595-1.

Authors

Taesic Lee¹, Hyunju Lee²,³,⁴

Affiliations

¹ Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, South Korea.
² Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, South Korea. hyunjulee@gist.ac.kr.
³ Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju, South Korea. hyunjulee@gist.ac.kr.
⁴ School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea. hyunjulee@gist.ac.kr.

Abstract

Identifying Alzheimer's disease (AD)-related genes from blood samples is crucial for early AD diagnosis. We used three public datasets, ADNI, AddNeuroMed1 (ANM1), and ANM2, for this study. Five feature selection methods were used to curate AD-related genes, and five classifiers were used to discriminate AD patients. In the internal validation (five-fold cross-validation within each dataset), the best average values of the area under the curve (AUC) were 0.657, 0.874, and 0.804 for ADNI, ANM1, and ANM2, respectively. In the external validation (training and test sets from different datasets), the best AUCs were 0.697 (training: ADNI to testing: ANM1), 0.764 (ADNI to ANM2), 0.619 (ANM1 to ADNI), 0.79 (ANM1 to ANM2), 0.655 (ANM2 to ADNI), and 0.859 (ANM2 to ANM1). These results suggest that although classification performance on ADNI is lower than on ANM1 and ANM2, classifiers trained on blood gene expression from one dataset can be used to classify AD in other datasets. In addition, pathway analysis showed that AD-related genes were enriched in inflammation, mitochondria, and Wnt signaling pathways. Our study suggests that blood gene expression data are useful for classifying AD.
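
The difference between the two validation schemes in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier is a toy stand-in for the five classifiers (which the abstract does not name individually), the cohorts are synthetic random data rather than ADNI or ANM expression values, and all function names (auc, fit_centroids, internal_auc, external_auc, make_cohort) are hypothetical. Feature selection is omitted.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_centroids(X, y):
    """Toy stand-in classifier (an assumption): per-class mean vector."""
    cents = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def score(cents, x):
    """Higher score means the sample is closer to the AD (label 1) centroid."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, cents[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(x, cents[1]))
    return d0 - d1

def stratified_folds(y, k, seed):
    """Split sample indices into k folds, keeping both classes in every fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for c in (0, 1):
        members = [i for i, lab in enumerate(y) if lab == c]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds

def internal_auc(X, y, k=5, seed=0):
    """Internal validation: mean AUC over k-fold CV within one dataset."""
    aucs = []
    for fold in stratified_folds(y, k, seed):
        held = set(fold)
        train = [i for i in range(len(X)) if i not in held]
        cents = fit_centroids([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([y[i] for i in fold],
                        [score(cents, X[i]) for i in fold]))
    return sum(aucs) / k

def external_auc(X_train, y_train, X_test, y_test):
    """External validation: train on one dataset, test on a different one."""
    cents = fit_centroids(X_train, y_train)
    return auc(y_test, [score(cents, x) for x in X_test])

def make_cohort(n, shift, seed, n_genes=5):
    """Synthetic cohort (illustration only): cases are shifted by `shift`."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label, 1.0) for _ in range(n_genes)])
        y.append(label)
    return X, y

# Two synthetic cohorts standing in for two independent datasets.
X_a, y_a = make_cohort(100, shift=1.0, seed=1)
X_b, y_b = make_cohort(100, shift=1.0, seed=2)
print("internal AUC (cohort A):", round(internal_auc(X_a, y_a), 3))
print("external AUC (A to B):  ", round(external_auc(X_a, y_a, X_b, y_b), 3))
```

In the internal scheme every test fold comes from the same cohort as its training data, whereas the external scheme never sees the test cohort during training, which is why the abstract reports the two sets of AUCs separately.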

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Alzheimer Disease / blood
Alzheimer Disease / diagnosis*
Area Under Curve
Databases, Factual
Gene Expression Regulation / physiology*
Humans
ROC Curve
Support Vector Machine