Hyper-graph based sparse canonical correlation analysis for the diagnosis of Alzheimer's disease from multi-dimensional genomic data

Methods. 2021 May:189:86-94. doi: 10.1016/j.ymeth.2020.04.008. Epub 2020 Apr 28.

Abstract

The effective and accurate diagnosis of Alzheimer's disease (AD), especially in the early stage (i.e., mild cognitive impairment (MCI)), remains a major challenge in AD research. To date, multiple biomarkers have been associated with AD diagnosis and progression. However, most existing studies use only single-modality data for diagnostic biomarker identification, and therefore do not exploit multi-modal data, which provide comprehensive and complementary information at multiple levels. In this paper, we integrate multi-modal genomic data from postmortem AD brains (i.e., mRNA, miRNA and epigenomic data) and propose a hyper-graph based sparse canonical correlation analysis (HGSCCA) method to extract the most correlated multi-modal biomarkers associated with AD and MCI. Specifically, our model uses the sparse canonical correlation analysis (SCCA) framework, which seeks the best linear projection for each input modality so that the strongest correlation among the selected features of the multi-dimensional genomic data is captured. In addition, to account for high-order relationships among different subjects, we introduce a hyper-graph-based regularization term that leads to the selection of more discriminative biomarkers. To evaluate the effectiveness of the proposed method, we conduct experiments on a well-known AD cohort, the Religious Orders Study and Memory and Aging Project (ROSMAP) dataset. The results show that our method not only identifies meaningful biomarkers for the diagnosis of AD, but also achieves better classification performance than the competing methods.
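
The abstract does not state the objective function or the optimization procedure, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes two modalities rather than three, standardized data matrices with subjects in rows, a k-nearest-neighbour hyper-graph with unit hyperedge weights, a commonly used normalized hyper-graph Laplacian, and a simple alternating soft-thresholding heuristic. The function names (knn_hypergraph_laplacian, sparse_cca_hypergraph) and the parameters k, l1, and lam are hypothetical.

```python
import numpy as np


def knn_hypergraph_laplacian(X, k=5):
    """Normalized hyper-graph Laplacian over subjects (rows of X).

    Assumption: one hyperedge per subject, joining that subject and its
    k nearest neighbours, with unit hyperedge weights.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, -np.inf)  # each subject always sorts first in its own row
    H = np.zeros((n, n))  # incidence matrix: rows = subjects, columns = hyperedges
    for e in range(n):
        members = np.argsort(d2[e])[: k + 1]  # subject e plus its k neighbours
        H[members, e] = 1.0
    w = np.ones(n)        # hyperedge weights
    dv = H @ w            # vertex degrees (>= 1, each subject is in its own edge)
    de = H.sum(axis=0)    # hyperedge degrees (k + 1 each)
    A = H / np.sqrt(dv)[:, None]
    theta = (A * (w / de)) @ A.T  # Dv^-1/2 H W De^-1 H^T Dv^-1/2
    return np.eye(n) - theta


def soft_threshold(a, t):
    """Elementwise soft-thresholding, which sets small entries to zero."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def sparse_cca_hypergraph(X, Y, L, l1=0.2, lam=0.1, n_iter=50, seed=0):
    """Heuristic alternating updates for a pair of sparse projections u, v.

    Illustrative target: make u^T X^T Y v large while penalizing
    u^T X^T L X u and v^T Y^T L Y v (projections that vary sharply
    within hyperedges) and encouraging sparsity via soft-thresholding.
    """
    eps = 1e-12
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(X.shape[1])
    u /= np.linalg.norm(u)
    v = rng.standard_normal(Y.shape[1])
    v /= np.linalg.norm(v)
    C = X.T @ Y        # cross-product between modalities
    Sx = X.T @ L @ X   # hyper-graph smoothness term for modality X
    Sy = Y.T @ L @ Y   # hyper-graph smoothness term for modality Y
    for _ in range(n_iter):
        g = C @ v - lam * (Sx @ u)
        u = soft_threshold(g / max(np.linalg.norm(g), eps), l1)
        u /= max(np.linalg.norm(u), eps)
        g = C.T @ u - lam * (Sy @ v)
        v = soft_threshold(g / max(np.linalg.norm(g), eps), l1)
        v /= max(np.linalg.norm(v), eps)
    return u, v
```

In such a sketch, the Laplacian could be built from the concatenated modalities, for example knn_hypergraph_laplacian(np.hstack([X, Y])), and the nonzero entries of u and v would then indicate the selected features in each modality. The choice of hyper-graph construction and of the penalty weights is an assumption here and would need to follow the full paper.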

Keywords: Alzheimer’s disease; Diagnostic biomarker; Hyper-graph; Multi-modal; Sparse canonical correlation analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Aged, 80 and over
  • Alzheimer Disease / diagnosis
  • Alzheimer Disease / genetics*
  • Epigenomics
  • Female
  • Genomics / methods*
  • Humans
  • Male
  • Multivariate Analysis