Machine learning models to predict onset of dementia: A label learning approach

Alzheimers Dement (N Y). 2019 Dec 10:5:918-925. doi: 10.1016/j.trci.2019.10.006. eCollection 2019.

Abstract

Introduction: The study objective was to build a machine learning model to predict incident mild cognitive impairment, Alzheimer's Disease, and related dementias from structured data using administrative and electronic health record sources.

Methods: A cohort of patients (n = 121,907) and controls (n = 5,307,045) was created for modeling using data within 2 years of the patient's incident diagnosis date. Additional cohorts 3-8 years removed from index data were used for prediction. Training cohorts were matched on age, gender, index year, and utilization, and modeled with a gradient boosting machine (LightGBM).
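
The matching step described in the Methods can be illustrated with a minimal sketch. The abstract does not say which matching algorithm was used, so exact matching without replacement is an assumption here, and the record layout and field names (`age`, `gender`, `index_year`, `utilization_band`) are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical field names; the abstract names the matching variables
# but not how they were encoded.
MATCH_KEYS = ("age", "gender", "index_year", "utilization_band")


def match_controls(cases, controls, keys=MATCH_KEYS, ratio=1, seed=0):
    """Pair each case with up to `ratio` controls that share the same key values.

    Assumption: exact matching without replacement. Cases with no
    available control are dropped from the result.
    """
    rng = random.Random(seed)

    # Group controls by their matching-key tuple.
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)

    # Shuffle each group so that the controls drawn are a random sample.
    for bucket in pool.values():
        rng.shuffle(bucket)

    matched = []
    for case in cases:
        bucket = pool[tuple(case[k] for k in keys)]
        picked = [bucket.pop() for _ in range(min(ratio, len(bucket)))]
        if picked:
            matched.append((case, picked))
    return matched
```

In a setting like the one described, the matched cases and controls would then form the training set for the gradient boosting model; a continuous variable such as utilization would need to be binned before exact matching can be applied.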

Results: On a held-out test set, the incident 2-year model had a sensitivity of 47% and an area-under-the-curve of 87%. In the 3-year model, the learned labels achieved a sensitivity of 24% (area-under-the-curve of 71%), with sensitivity dropping to 15% (area-under-the-curve of 72%) in year 8.
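
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how sensitivity and area-under-the-curve can be computed from model scores. The decision threshold and the toy labels and scores are illustrative assumptions; the abstract does not state the operating threshold used.

```python
def sensitivity(y_true, y_score, threshold):
    """Fraction of true cases (label 1) whose score is at or above the threshold."""
    case_scores = [s for y, s in zip(y_true, y_score) if y == 1]
    true_positives = sum(1 for s in case_scores if s >= threshold)
    return true_positives / len(case_scores)


def auc(y_true, y_score):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen case scores higher than
    a randomly chosen control, with ties counted as one half. This pairwise
    version is quadratic in the sample size, so it is only suitable for
    small illustrative inputs.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, not taken from the study.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(sensitivity(labels, scores, threshold=0.5))  # 0.5
print(auc(labels, scores))                         # 0.75
```

Sensitivity depends on the chosen threshold while area-under-the-curve does not, which is why the two figures can move in different directions across the year 3 and year 8 models.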

Discussion: The ability of the model to discriminate incident cases of dementia implies that it can be a worthwhile tool to screen patients for trial recruitment and patient management.

Keywords: Alzheimer's disease; Gradient boosting machine; Machine learning; Onset of dementia; Prediction.