Machine learning for the prediction of cognitive impairment in older adults

Front Neurosci. 2023 Apr 27:17:1158141. doi: 10.3389/fnins.2023.1158141. eCollection 2023.

Abstract

Objective: The purpose of this study was to develop and validate a predictive model of cognitive impairment in older adults based on a novel machine learning (ML) algorithm.

Methods: The complete data of 2,226 participants aged 60-80 years were extracted from the 2011-2014 National Health and Nutrition Examination Survey database. Cognitive abilities were assessed using a composite cognitive functioning score (Z-score) calculated using a correlation test among the Consortium to Establish a Registry for Alzheimer's Disease Word Learning and Delayed Recall tests, Animal Fluency Test, and the Digit Symbol Substitution Test. Thirteen demographic characteristics and risk factors associated with cognitive impairment were considered: age, sex, race, body mass index (BMI), drink, smoke, direct HDL-cholesterol level, stroke history, dietary inflammatory index (DII), glycated hemoglobin (HbA1c), Patient Health Questionnaire-9 (PHQ-9) score, sleep duration, and albumin level. Feature selection is performed using the Boruta algorithm. Model building is performed using ten-fold cross-validation, machine learning (ML) algorithms such as generalized linear model (GLM), random forest (RF), support vector machine (SVM), artificial neural network (ANN), and stochastic gradient boosting (SGB). The performance of these models was evaluated in terms of discriminatory power and clinical application.

Results: The study ultimately included 2,226 older adults for analysis, of whom 384 (17.25%) had cognitive impairment. After random assignment, 1,559 and 667 older adults were included in the training and test sets, respectively. A total of 10 variables such as age, race, BMI, direct HDL-cholesterol level, stroke history, DII, HbA1c, PHQ-9 score, sleep duration, and albumin level were selected to construct the model. GLM, RF, SVM, ANN, and SGB were established to obtain the area under the working characteristic curve of the test set subjects 0.779, 0.754, 0.726, 0.776, and 0.754. Among all models, the GLM model had the best predictive performance in terms of discriminatory power and clinical application.

Conclusions: ML models can be a reliable tool to predict the occurrence of cognitive impairment in older adults. This study used machine learning methods to develop and validate a well performing risk prediction model for the development of cognitive impairment in the elderly.

Keywords: NHANES; cognitive function; machine learning; older adults; prediction model.

Grants and funding

The study was supported by Key Science and Technology Brainstorm Project of Guangzhou (202103000027), National Key R&D Program of China (2020YFC2005700), and Guangdong Provincial Key Laboratory of Traditional Chinese Medicine Informatization (2021B1212040007).