Prediction of cardiovascular disease risk based on major contributing features

Sci Rep. 2023 Mar 23;13(1):4778. doi: 10.1038/s41598-023-31870-8.

Abstract

Cardiovascular disease (CVD) is a serious health threat worldwide. Using machine learning methods to predict CVD risk is valuable for identifying high-risk patients and intervening in time. In this study, we propose XGBH, a machine learning model that predicts CVD risk from key contributing features. To improve the generalisation of the model, retrospective data from 14,832 CVD patients in Shanxi, China, were added to the Kaggle dataset. In validation, the XGBH risk prediction model was considerably more accurate (AUC = 0.81) than the baseline risk score (AUC = 0.65), and its accuracy improved further when body mass index (BMI), a conventional biometric variable, was included. To make the model easier to apply clinically, we also designed a simpler diagnostic model that requires only three patient characteristics (age, systolic blood pressure, and whether cholesterol is normal). It enables early intervention for high-risk patients at the cost of a slight reduction in accuracy (AUC = 0.79). Ultimately, a CVD risk score model with few features and high accuracy is established based on the main contributing features. Further prospective studies, as well as studies in other populations, are needed to assess the actual clinical effectiveness of the XGBH risk prediction model.
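
The AUC values quoted above measure discrimination: the probability that a randomly chosen patient with CVD receives a higher predicted risk than a randomly chosen patient without it. A minimal Python sketch of that calculation follows. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not code or data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = CVD, 0 = no CVD; scores are predicted risks.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of 9 pairs are ranked correctly -> 0.889
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the context for comparing 0.81 and 0.79 with the baseline of 0.65.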

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Body Mass Index
  • Cardiovascular Diseases* / diagnosis
  • Cardiovascular Diseases* / epidemiology
  • Female
  • Humans
  • Male
  • Mass Screening
  • Middle Aged
  • Probability
  • Retrospective Studies
  • Risk Factors