A Comprehensive Machine Learning Framework for the Exact Prediction of the Age of Onset in Familial and Sporadic Alzheimer's Disease

Diagnostics (Basel). 2021 May 17;11(5):887. doi: 10.3390/diagnostics11050887.

Abstract

Machine learning (ML) algorithms are widely used to develop predictive frameworks. Accurate prediction of Alzheimer's disease (AD) age of onset (ADAOO) is crucial for investigating potential treatments, follow-up, and therapeutic interventions. Although genetic and non-genetic factors affecting ADAOO have been elucidated by our group and others, the comprehensive and sequential application of ML to provide an exact estimate of the actual ADAOO, rather than a high-confidence interval within which the actual ADAOO may fall, remains to be explored. Here, we assessed the performance of ML algorithms for predicting ADAOO in two AD cohorts, one with early-onset familial AD and one with late-onset sporadic AD, combining genetic and demographic variables. Performance was assessed using the root mean squared error (RMSE), R-squared (R²), and the mean absolute error (MAE) with a 10-fold cross-validation procedure. For predicting ADAOO in familial AD, boosting-based ML algorithms performed best. In the sporadic cohort, boosting-based ML algorithms performed best on the training data set, while regularization methods performed best on unseen data. ML algorithms represent a feasible alternative for accurately predicting ADAOO with little human intervention. Future studies may include using ML to predict the speed of cognitive decline in our cohorts.

Keywords: Alzheimer’s disease; PSEN1; age of onset; genetic isolates; machine learning; natural history; predictive genomics.
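
The evaluation scheme named in the abstract (RMSE, R-squared, and MAE under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the single predictor is hypothetical, and a one-variable least-squares line stands in for the boosting and regularization models compared in the study. All function names (`rmse`, `mae`, `r_squared`, `fit_line`, `k_fold_scores`, `mean_scores`) are illustrative assumptions.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope).

    Stand-in model only; the study compares boosting and regularized learners.
    """
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_scores(x, y, k=10, seed=0):
    """Score the stand-in model on each of k held-out folds."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in held_out]
        y_pred = [intercept + slope * x[i] for i in held_out]
        scores.append({
            "rmse": rmse(y_true, y_pred),
            "mae": mae(y_true, y_pred),
            "r2": r_squared(y_true, y_pred),
        })
    return scores


def mean_scores(scores):
    """Average each metric across folds."""
    return {name: sum(s[name] for s in scores) / len(scores) for name in scores[0]}


# Synthetic data (assumption): one hypothetical predictor and a noisy linear
# "age of onset" response. No real cohort values are used.
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [50 + 2 * xi + rng.gauss(0, 3) for xi in x]

scores = k_fold_scores(x, y, k=10)
summary = mean_scores(scores)
print(summary)
```

Averaging the per-fold metrics gives the cross-validated estimate of performance on unseen data, which is the quantity that distinguishes the training-set and unseen-data results reported for the sporadic cohort.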