Unsupervised Machine Learning to Identify Separable Clinical Alzheimer's Disease Sub-Populations

Jayant Prakash; Velda Wang; Robert E Quinn 3rd; Cassie S Mitchell

doi:10.3390/brainsci11080977

Brain Sci. 2021 Jul 23;11(8):977. doi: 10.3390/brainsci11080977.

Authors

Jayant Prakash¹,², Velda Wang¹, Robert E Quinn 3rd¹,², Cassie S Mitchell¹,³

Affiliations

¹ Laboratory for Pathology Dynamics, Department of Biomedical Engineering, Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA 30332, USA.
² Department of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA.
³ Center for Machine Learning, Georgia Institute of Technology, Atlanta, GA 30332, USA.

Abstract

Heterogeneity among Alzheimer's disease (AD) patients confounds clinical trial patient selection and therapeutic efficacy evaluation. This work defines separable AD clinical sub-populations using unsupervised machine learning. Clustering of patient features (t-SNE followed by k-means) and association rule mining (ARM) were performed on the ADNIMERGE dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Patient sociodemographics, brain imaging, biomarkers, cognitive tests, and medication usage were included for analysis. Four AD clinical sub-populations were identified using between-cluster mean fold changes [cognitive performance, brain volume]: cluster-1 represented the least severe disease [+17.3, +13.3]; cluster-0 [-4.6, +3.8] and cluster-3 [+10.8, -4.9] represented mid-severity sub-populations; cluster-2 represented the most severe disease [-18.4, -8.4]. ARM assessed frequently occurring pharmacologic substances within the 4 sub-populations. No drug class was associated with the least severe AD (cluster-1), likely due to lesser antecedent disease. Anti-hyperlipidemia drugs were associated with cluster-0 (mid-severity, higher volume). Interestingly, the antioxidants vitamin C and E were associated with cluster-3 (mid-severity, higher cognition). Antidepressants such as Zoloft were associated with the most severe disease (cluster-2). Vitamin D is protective for AD, but ARM identified significant underutilization across all AD sub-populations. Identification and feature characterization of four distinct AD sub-population "clusters" using standard clinical features enhances future clinical trial selection criteria and cross-study comparative analysis.

Keywords: Alzheimer’s disease; clinical trial design; drug repurposing; machine learning; population analysis; risk factors.
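
The pipeline named in the abstract (t-SNE embedding, k-means clustering, then association rule mining within clusters) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn and NumPy, uses a synthetic feature matrix rather than ADNIMERGE, and picks standardization, perplexity, and random seeds arbitrarily. The helper `rule_metrics`, the rule form {item} -> {cluster}, and the medication labels `drug_a` and `drug_b` are hypothetical; only k = 4 comes from the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a patient-by-feature matrix (NOT ADNI data):
# four groups of 40 "patients", each with 6 numeric features.
centers = rng.normal(scale=6.0, size=(4, 6))
X = np.vstack([c + rng.normal(size=(40, 6)) for c in centers])

# Step 1: standardize the features, then embed them in 2-D with t-SNE.
# Standardization and perplexity=30 are assumptions of this sketch.
X_scaled = StandardScaler().fit_transform(X)
embedding = TSNE(
    n_components=2, perplexity=30.0, init="pca", random_state=0
).fit_transform(X_scaled)

# Step 2: run k-means on the embedding. k=4 mirrors the four
# sub-populations reported in the abstract.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedding)


def rule_metrics(records, item, cluster):
    """Support, confidence, and lift for the rule {item} -> {cluster}.

    records: list of (set_of_items, cluster_label) pairs, one per patient.
    """
    n = len(records)
    n_item = sum(1 for items, _ in records if item in items)
    n_cluster = sum(1 for _, c in records if c == cluster)
    n_both = sum(1 for items, c in records if item in items and c == cluster)
    support = n_both / n
    confidence = n_both / n_item if n_item else 0.0
    lift = confidence / (n_cluster / n) if n_cluster else 0.0
    return support, confidence, lift


# Step 3: hypothetical medication records. Every patient in the first
# patient's cluster takes "drug_a"; everyone else takes "drug_b".
target = labels[0]
records = [
    ({"drug_a"} if lab == target else {"drug_b"}, lab) for lab in labels
]
support, confidence, lift = rule_metrics(records, "drug_a", target)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 indicates that the item occurs in the cluster more often than its overall frequency would predict, which is one common way to decide that a medication is "associated with" a sub-population. The thresholds and the mining algorithm used in the study are not stated in the abstract.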

Grants and funding