A multiclass extreme gradient boosting model for evaluation of transcriptomic biomarkers in Alzheimer's disease prediction

Neurosci Lett. 2024 Jan 31:821:137609. doi: 10.1016/j.neulet.2023.137609. Epub 2023 Dec 27.

Abstract

Background: Patients with young-onset Alzheimer's disease (AD), defined as onset before the age of 50 years, often lack obvious imaging changes and amyloid protein deposition, which can lead to misdiagnosis as other forms of cognitive impairment. Given the association between immunological dysfunction and the progression of neurodegenerative disease, recent research has focused on identifying blood transcriptomic signatures for precise prediction of AD.

Methods: In this study, we extracted blood biomarkers from large-scale transcriptomic data to construct multiclass eXtreme Gradient Boosting (XGBoost) models, and evaluated their performance in distinguishing AD from cognitively normal (CN) status and mild cognitive impairment (MCI).
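As an illustration of how a multiclass boosted model turns per-class scores into a three-way call, the sketch below applies a softmax to raw class scores and picks the most probable class. The abstract does not state which objective or class ordering the authors used, so the softmax objective, the `CLASSES` order, and all function names here are assumptions made for illustration only.

```python
import math

# Assumed label order for illustration; the abstract does not specify one.
CLASSES = ["CN", "MCI", "AD"]


def softmax(scores):
    """Convert per-class raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores):
    """Return the most probable class label and the full probability vector."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


def softmax_gradient(scores, true_index):
    """Gradient of the multiclass log loss w.r.t. raw scores: p_k - y_k.

    In gradient boosting, each new round of trees is fitted to reduce this
    gradient, one tree per class.
    """
    probs = softmax(scores)
    return [p - (1.0 if k == true_index else 0.0) for k, p in enumerate(probs)]
```

For example, `predict_class([0.1, 0.4, 2.0])` selects "AD" because the third score is the largest; the scores are hypothetical and not taken from the study.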

Results: Independent testing with an external dataset revealed that the combination of blood transcriptomic signatures achieved an area under the receiver operating characteristic curve (AUC) of 0.81 for multiclass classification (sensitivity = 0.81; specificity = 0.63), 0.83 for classification of AD vs. CN (sensitivity = 0.72; specificity = 0.73), and 0.85 for classification of AD vs. MCI (sensitivity = 0.77; specificity = 0.73). These candidate signatures were significantly enriched in 62 chromosome regions, such as Chr.19p12-19p13.3, Chr.1p22.1-1p31.1, and Chr.1q21.2-1p23.1 (adjusted p < 0.05), and significantly overrepresented by 26 transcription factors, including E2F2, FOXO3, and GATA1 (adjusted p < 0.05). Biological analysis of these signatures pointed to systemic dysregulation of immune responses, hematopoiesis, exocytosis, and neuronal support in neurodegenerative disease (adjusted p < 0.05).
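For readers unfamiliar with the reported metrics, the following sketch shows one common way to compute sensitivity, specificity, and AUC from predictions. The abstract does not say how the multiclass AUC was averaged, so the macro one-vs-rest averaging below is an assumption, and all function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def binary_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, prob_rows, classes):
    """Average the one-vs-rest AUC over classes (assumed averaging scheme)."""
    aucs = []
    for k, c in enumerate(classes):
        labels = [t == c for t in y_true]
        scores = [row[k] for row in prob_rows]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)
```

Under this definition, an AUC of 0.83 for AD vs. CN would mean that a randomly chosen AD sample receives a higher AD score than a randomly chosen CN sample about 83% of the time.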

Conclusions: Blood transcriptomic biomarkers hold great promise for clinical use in the accurate assessment and prediction of AD.

Keywords: Alzheimer’s disease; Blood transcriptomic biomarkers; eXtreme Gradient Boosting; Machine learning; Multiclass classification.

MeSH terms

  • Alzheimer Disease* / diagnosis
  • Alzheimer Disease* / genetics
  • Biomarkers
  • Cognitive Dysfunction* / diagnosis
  • Cognitive Dysfunction* / genetics
  • Disease Progression
  • Gene Expression Profiling
  • Humans
  • Magnetic Resonance Imaging / methods
  • Middle Aged
  • Neurodegenerative Diseases*
  • Sensitivity and Specificity
  • Transcriptome

Substances

  • Biomarkers