Risk factors for cardiovascular disease in patients with metabolic-associated fatty liver disease: a machine learning approach

Cardiovasc Diabetol. 2022 Nov 12;21(1):240. doi: 10.1186/s12933-022-01672-9.

Abstract

Background: Nonalcoholic fatty liver disease is associated with an increased cardiovascular disease (CVD) risk, although the exact mechanisms remain unclear. Moreover, the relationship between the newly redefined metabolic-associated fatty liver disease (MAFLD) and CVD risk has been poorly investigated. Data-driven machine learning (ML) techniques may help identify the most important risk factors for CVD in patients with MAFLD.

Methods: In this observational study, patients with MAFLD underwent subclinical atherosclerosis assessment and blood biochemical analysis. Patients were split into two groups based on the presence of CVD (defined as at least one of the following: coronary artery disease; myocardial infarction; coronary bypass grafting; stroke; carotid stenosis; lower extremity artery stenosis). ML techniques were used to construct a model that could identify individuals with the highest risk of CVD. We used a multiple logistic regression classifier operating on the most discriminative patient parameters, either selected by univariate feature ranking or extracted using principal component analysis (PCA). Receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) were calculated for the investigated classifiers, and the optimal cut-point values were extracted from the ROC curves using the Youden index, the closest-to-(0, 1) criterion, and the Index of Union method.
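
The three cut-point criteria named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the toy labels and scores and the function names (roc_points, auc, cut_points) are assumptions made for illustration only. The sketch also assumes the usual definitions of the criteria: the Youden index J = sensitivity + specificity - 1 (maximized), the Euclidean distance to the (0, 1) corner of ROC space (minimized), and the Index of Union |sensitivity - AUC| + |specificity - AUC| (minimized).

```python
from math import hypot


def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A case is called positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((t, tp / pos, tn / neg))
    return points


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as 0.5 (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cut_points(labels, scores):
    """Pick a threshold under each of the three criteria."""
    pts = roc_points(labels, scores)
    a = auc(labels, scores)
    youden = max(pts, key=lambda p: p[1] + p[2] - 1)
    closest = min(pts, key=lambda p: hypot(1 - p[1], 1 - p[2]))
    union = min(pts, key=lambda p: abs(p[1] - a) + abs(p[2] - a))
    return {
        "auc": a,
        "youden": youden[0],
        "closest_01": closest[0],
        "index_of_union": union[0],
    }


# Hypothetical toy data (1 = CVD, 0 = no CVD); NOT taken from the study.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
result = cut_points(labels, scores)
print(result)
```

On this toy data the Youden index selects a lower threshold than the other two criteria, which shows why the choice of criterion can matter when sensitivity and specificity trade off unevenly.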

Results: Of 191 patients with MAFLD (mean age: 58, SD: 12 years; 46% female), 47 (25%) had a history of CVD. The most important clinical variables included hypercholesterolemia, the plaque scores, and duration of diabetes. ML models fitted on the five, ten, and fifteen most discriminative parameters selected by univariate feature ranking achieved AUCs of 0.84 (95% confidence interval [CI]: 0.77-0.90, p < 0.0001), 0.86 (95% CI 0.80-0.91, p < 0.0001), and 0.87 (95% CI 0.82-0.92, p < 0.0001), respectively, whereas the classifier fitted over 10 principal components extracted using PCA followed by parallel analysis achieved an AUC of 0.86 (95% CI 0.81-0.91, p < 0.0001). The best model operating on the 5 most discriminative features correctly identified 114/144 (79.17%) low-risk and 40/47 (85.11%) high-risk patients.
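
The reported classification rates can be reproduced from the counts given in the Results. The short Python sketch below assumes that "low-risk" corresponds to the 144 patients without a history of CVD and "high-risk" to the 47 patients with one, so that the two percentages are the specificity and sensitivity; the variable names are illustrative. The overall accuracy and Youden index are derived here from those counts and are not figures stated in the abstract.

```python
# Counts reported in the Results for the 5-feature model.
tn, n_low = 114, 144   # correctly identified low-risk patients / all low-risk
tp, n_high = 40, 47    # correctly identified high-risk patients / all high-risk

specificity = tn / n_low                     # share of low-risk patients identified
sensitivity = tp / n_high                    # share of high-risk patients identified
accuracy = (tn + tp) / (n_low + n_high)      # derived; not stated in the abstract
youden_j = sensitivity + specificity - 1     # derived; not stated in the abstract
prevalence = n_high / (n_low + n_high)       # share of patients with CVD history

print(f"specificity = {100 * specificity:.2f}%")
print(f"sensitivity = {100 * sensitivity:.2f}%")
print(f"accuracy    = {100 * accuracy:.2f}%")
print(f"Youden J    = {youden_j:.4f}")
print(f"prevalence  = {100 * prevalence:.1f}%")
```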

Conclusion: An ML approach demonstrated high performance in identifying MAFLD patients with prevalent CVD based on easy-to-obtain patient parameters.

Keywords: Cardiovascular disease; Machine learning; Metabolic-associated fatty liver disease.

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cardiovascular Diseases* / diagnosis
  • Cardiovascular Diseases* / epidemiology
  • Cardiovascular Diseases* / etiology
  • Female
  • Heart Disease Risk Factors
  • Humans
  • Liver Diseases* / complications
  • Machine Learning
  • Male
  • Middle Aged
  • Non-alcoholic Fatty Liver Disease* / complications
  • Risk Factors