Construction of Xinjiang metabolic syndrome risk prediction model based on interpretable models

BMC Public Health. 2022 Feb 8;22(1):251. doi: 10.1186/s12889-022-12617-y.

Abstract

Background: We aimed to construct simple and practical metabolic syndrome (MetS) risk prediction models based on the data of inhabitants of Urumqi and to provide a methodological reference for the prevention and control of MetS.

Methods: This cross-sectional study was conducted in the Xinjiang Uygur Autonomous Region of China. We collected data from inhabitants of Urumqi from 2018 to 2019, including demographic characteristics, anthropometric indicators, living habits, and family history. Resampling techniques were used to address class imbalance in the data, and MetS risk prediction models were then constructed using logistic regression (LR) and decision tree (DT) methods. In addition, nomograms and DT tree diagrams were used to explain and visualize the models.
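The resampling step can be illustrated with a minimal random undersampling sketch. This is not the authors' code: the function name `random_undersample`, the toy records, and the seed are hypothetical, and the 128/872 split is chosen only to mirror the class ratio reported in the Results. The idea is to randomly discard majority-class records until both classes are the same size, and then fit the LR and DT models on the balanced set.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly keep only as many rows per class as the smallest class has."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Size of the minority class sets the per-class sample size.
    n_min = min(len(indices) for indices in by_class.values())

    # Sample n_min indices without replacement from every class.
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_min))
    kept.sort()  # restore the original record order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 12.8% positives (label 1), 87.2% negatives (label 0).
labels = [1] * 128 + [0] * 872
rows = [{"id": i} for i in range(len(labels))]

balanced_rows, balanced_labels = random_undersample(rows, labels, seed=42)
print(Counter(balanced_labels))
```

Undersampling discards information from the majority class, so it is usually applied to the training split only, with performance measured on data that keeps the original class ratio.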

Results: Of the 25,542 participants included in the study, 3,267 (12.8%) were diagnosed with MetS and 22,275 (87.2%) were not. Both the LR and DT models based on the random undersampling dataset had good AUROC values (0.846 and 0.913, respectively). The accuracy, sensitivity, specificity, and AUROC values of the DT model were higher than those of the LR model. Based on the random undersampling dataset, the LR model showed that exercise such as walking (OR=0.769) and running (OR=0.736) was a protective factor against MetS, whereas age 60 to 74 years (OR=1.388), previous diabetes (OR=8.902), previous hypertension (OR=2.830), fatty liver (OR=3.306), smoking (OR=1.541), high systolic blood pressure (SBP; OR=1.044), and high diastolic blood pressure (DBP; OR=1.072) were risk factors for MetS. The DT model had a depth of 7 layers and 18 leaves. BMI, the root node of the DT, was the most important factor affecting MetS, followed in descending order of importance by SBP, previous diabetes, previous hypertension, DBP, fatty liver, smoking, and exercise.
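As a reading aid for the odds ratios above, the sketch below converts an OR to its logistic regression coefficient (the natural log of the OR) and rescales a per-unit OR to a larger increment. The helper names are hypothetical. The sketch also assumes that SBP entered the model as a continuous variable measured per mmHg, which the abstract does not state; if the variable was coded differently, the rescaled value does not apply.

```python
import math


def or_to_coefficient(odds_ratio):
    """Logistic regression coefficient corresponding to an odds ratio."""
    return math.log(odds_ratio)


def rescale_or(odds_ratio, units):
    """Odds ratio for a change of `units`, given the OR for a change of one unit."""
    return odds_ratio ** units


def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio (negative = lower odds)."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios copied from the Results.
sbp_or = 1.044       # assumed to be per 1 mmHg (not stated in the abstract)
walking_or = 0.769

print(round(or_to_coefficient(sbp_or), 4))         # coefficient on the log-odds scale
print(round(rescale_or(sbp_or, 10), 3))            # OR for a 10-unit increase
print(round(percent_change_in_odds(walking_or), 1))  # percent change in odds for walking
```

Under that assumption, an OR close to 1 per unit can still correspond to a substantial change in odds across a clinically meaningful range, which is why continuous and binary predictors are not directly comparable by OR alone.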

Conclusions: Both the DT and LR MetS risk prediction models showed good predictive performance, and each has its own characteristics. Combining the two methods to construct an interpretable MetS risk prediction model can provide a methodological reference for the prevention and control of MetS.

Keywords: Interpretable model; Metabolic syndrome; Prediction; Risk factors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cross-Sectional Studies
  • Diabetes Mellitus*
  • Fatty Liver*
  • Humans
  • Hypertension* / epidemiology
  • Metabolic Syndrome*
  • Middle Aged
  • Risk Factors