Identifying dementia from cognitive footprints in hospital records among Chinese older adults: a machine-learning study

Lancet Reg Health West Pac. 2024 Apr 12:46:101060. doi: 10.1016/j.lanwpc.2024.101060. eCollection 2024 May.

Abstract

Background: Combining theory-driven and data-driven methods, this study aimed to develop algorithms for predicting dementia among Chinese older adults, guided by the cognitive footprint theory.

Methods: Electronic medical records from the Clinical Data Analysis and Reporting System in Hong Kong were used. We included patients with dementia diagnosed at age 65 years or older between 2010 and 2018, and 1:1 matched dementia-free controls. We identified 51 features, comprising exposures to established modifiable factors and other factors before and after 65 years of age. The performance of four machine learning models (LASSO, multilayer perceptron [MLP], XGBoost, and LightGBM) was compared with that of logistic regression models, for all patients and for subgroups by age.
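
The 1:1 matching step can be illustrated with a minimal sketch. The abstract does not state which variables the controls were matched on, so the matching keys used here (`sex` and `birth_year`), the record layout, and the function name `match_one_to_one` are hypothetical and chosen purely for illustration; this is not the authors' procedure.

```python
import random
from collections import defaultdict


def match_one_to_one(cases, controls, keys=("sex", "birth_year"), seed=0):
    """Pair each case with one unused control that has identical values for `keys`.

    `keys` is a hypothetical choice; the abstract does not name the matching variables.
    Returns (pairs, unmatched_cases).
    """
    rng = random.Random(seed)

    # Group controls by their matching-key values.
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)

    # Shuffle within each group so the control picked is random but reproducible.
    for group in pool.values():
        rng.shuffle(group)

    pairs, unmatched = [], []
    for case in cases:
        candidates = pool.get(tuple(case[k] for k in keys))
        if candidates:
            # pop() removes the control, so it cannot be reused (matching without replacement).
            pairs.append((case, candidates.pop()))
        else:
            unmatched.append(case)
    return pairs, unmatched


# Toy records, invented for this example only.
cases = [
    {"id": 1, "sex": "F", "birth_year": 1935},
    {"id": 2, "sex": "M", "birth_year": 1940},
    {"id": 3, "sex": "F", "birth_year": 1950},
]
controls = [
    {"id": 10, "sex": "F", "birth_year": 1935},
    {"id": 11, "sex": "F", "birth_year": 1935},
    {"id": 12, "sex": "M", "birth_year": 1940},
]
pairs, unmatched = match_one_to_one(cases, controls)
```

In this toy input, two cases find a control and one case (no control shares its keys) is returned as unmatched; a real pipeline would need a rule for such cases, which the abstract does not describe.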

Findings: A total of 159,920 individuals (40.5% male; mean age [SD]: 83.97 [7.38]) were included. Compared with the model that included established modifiable factors only (area under the curve [AUC] 0.689, 95% CI [0.684, 0.694]), predictive accuracy improved substantially for models with all factors (0.774, [0.770, 0.778]). Machine learning and logistic regression models performed similarly, with AUCs ranging from 0.773 (0.768, 0.777) for LASSO to 0.780 (0.776, 0.784) for MLP. Antipsychotics, education, antidepressants, head injury, and stroke were identified as the most important predictors in the total sample. Age-specific models identified different important features, with cardiovascular and infectious diseases becoming prominent at older ages.
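
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an AUC and a 95% confidence interval can be computed. The abstract does not say how the intervals were derived; the percentile bootstrap used here is an assumption, and the function names `auc` and `bootstrap_auc_ci` are illustrative rather than taken from the study.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formulation; tied scores get average ranks.

    labels: 1 for a case, 0 for a control. scores: model-predicted risk.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied scores starting at i.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one control")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap CI (an assumed method, see lead-in)."""
    point = auc(labels, scores)  # also raises if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper
```

Calling `bootstrap_auc_ci(labels, scores)` on held-out predictions returns a tuple `(auc, lower, upper)`; with samples as large as the one reported here, intervals as narrow as those in the abstract are expected.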

Interpretation: The models showed satisfactory performance in identifying dementia. These algorithms can be used in clinical practice to assist decision making and to enable timely interventions in a cost-effective manner.

Funding: The Research Grants Council of Hong Kong under the Early Career Scheme 27110519.

Keywords: Cognitive footprints; Dementia; Electronic medical records; Machine learning.