Ischemic stroke prediction using machine learning in elderly Chinese population: The Rugao Longitudinal Ageing Study

Brain Behav. 2023 Dec;13(12):e3307. doi: 10.1002/brb3.3307. Epub 2023 Nov 7.

Abstract

Objective: To compare logistic regression (LR) with machine learning (ML) models for predicting the risk of ischemic stroke in an elderly population in China.

Methods: We used 2208 records from the Rugao Longitudinal Ageing Study (RLAS) to assess ischemic stroke risk prediction. Input variables included 103 phenotypes. For 3-year ischemic stroke risk prediction, we compared the discrimination and calibration of the LR model with those of five ML methods: Random Forest (RF), Gaussian kernel Support Vector Machine (SVM), Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), and Gradient Boosting Decision Tree (GBDT).
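
As an illustration of the two evaluation criteria named above, the following minimal sketch computes a discrimination measure (the C-index, which for a binary outcome is the share of case/non-case pairs ranked correctly) and a calibration-related measure (the Brier score). The abstract does not state which calibration metric the authors used, so the Brier score is an assumption here; the data, the function names `c_index` and `brier_score`, and the two "models" are hypothetical toy values, not RLAS data.

```python
from itertools import product


def c_index(y_true, risk):
    """C-index for a binary outcome: fraction of (case, non-case) pairs
    in which the case received the higher predicted risk (ties count 0.5)."""
    cases = [r for y, r in zip(y_true, risk) if y == 1]
    non_cases = [r for y, r in zip(y_true, risk) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for r_case, r_non in product(cases, non_cases):
        if r_case > r_non:
            score += 1.0
        elif r_case == r_non:
            score += 0.5
    return score / (len(cases) * len(non_cases))


def brier_score(y_true, risk):
    """Mean squared difference between predicted risk and observed outcome
    (lower is better). Used here as an assumed stand-in for calibration."""
    return sum((r - y) ** 2 for y, r in zip(y_true, risk)) / len(y_true)


# Hypothetical toy data: 1 = ischemic stroke within follow-up, 0 = no event.
y = [0, 0, 1, 0, 1, 1]
model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]  # hypothetical predicted risks
model_b = [0.30, 0.50, 0.40, 0.45, 0.60, 0.35]  # hypothetical predicted risks

for name, risk in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(c_index(y, risk), 3), round(brier_score(y, risk), 3))
```

In this toy example, the model with the higher C-index also has the lower Brier score, but the two measures can disagree in practice, which is why discrimination and calibration are reported separately.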

Results: Age, pulse, waist circumference, education level, β2-microglobulin, homocysteine, cystatin C, folate, free triiodothyronine, platelet distribution width, QT interval, and QTc interval were identified as significant predictors of ischemic stroke. For ischemic stroke prediction, the ML approaches drew on more biochemical and ECG-related multidimensional phenotypic indicators than the LR model, which gave greater weight to general demographic indicators. SVM provided the best discrimination and calibration and outperformed the LR model (C-index: 0.79 vs. 0.71, an 11.27% improvement in model utility), with the best performance in both validation and test data.
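
The reported 11.27% figure is consistent with the relative difference between the two C-indices, (0.79 - 0.71) / 0.71. The abstract does not spell out this calculation, so the sketch below is an assumption about how the number may have been derived; `relative_improvement` is a hypothetical helper name.

```python
def relative_improvement(new, baseline):
    """Relative change of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


svm_c_index = 0.79  # C-index reported for SVM
lr_c_index = 0.71   # C-index reported for LR

print(f"{relative_improvement(svm_c_index, lr_c_index):.2f}%")  # 11.27%
```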

Conclusion: In a comparison of LR with five ML models, ischemic stroke prediction was more accurate when ML was combined with multiple phenotypes. Taken together with other studies of elderly populations in China, these results indicate that ML techniques, especially SVM, show good long-term predictive performance, suggesting potential value for ML in clinical practice.

Keywords: ischemic stroke; logistic regression; machine learning; prediction; risk factors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Aging
  • Algorithms
  • China / epidemiology
  • Humans
  • Ischemic Stroke*
  • Machine Learning