Prediction model of obstructive sleep apnea-related hypertension: Machine learning-based development and interpretation study

Front Cardiovasc Med. 2022 Dec 5:9:1042996. doi: 10.3389/fcvm.2022.1042996. eCollection 2022.

Abstract

Background: Obstructive sleep apnea (OSA) is a globally prevalent disease closely associated with hypertension. To date, no predictive model for OSA-related hypertension has been established. We aimed to use machine learning (ML) to construct a model to analyze risk factors and predict OSA-related hypertension.

Materials and methods: We retrospectively collected the clinical data of OSA patients diagnosed by polysomnography from October 2019 to December 2021. A total of 1,493 OSA patients with 27 variables were included and randomly divided into training and validation sets. Independent risk factors for OSA-related hypertension were screened using multivariable logistic regression. Six ML algorithms, namely logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), bootstrapped aggregating (Bagging), and multilayer perceptron (MLP), were used to develop models on the training set. The validation set was used to tune the model hyperparameters and determine the final prediction model. We compared the accuracy and discrimination of the models to identify the best ML algorithm for predicting OSA-related hypertension. We used permutation importance and Shapley additive explanations (SHAP) to determine the importance of the selected features and interpret the ML models. In addition, a web-based tool was developed to promote clinical application of the model.
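The model comparison and permutation-importance steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the patient cohort, picks a 70/30 split and default hyperparameters that the abstract does not specify, and leaves out XGBoost, hyperparameter tuning, and SHAP so that it needs no further packages.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier)
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the final feature set
# (1,493 rows, 18 features). These are NOT the study data.
X, y = make_classification(n_samples=1493, n_features=18,
                           n_informative=6, random_state=0)

# Assumed 70/30 random split; the abstract does not state the ratio.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Five of the six algorithm families, all with default settings (assumption).
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "GBM": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=500, random_state=0)),
}

# Fit each model on the training set and score discrimination (AUC)
# on the validation set.
auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc[name] = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

best_name = max(auc, key=auc.get)

# Permutation importance: the mean drop in validation AUC when one
# feature column is shuffled, averaged over n_repeats shuffles.
result = permutation_importance(models[best_name], X_val, y_val,
                                scoring="roc_auc", n_repeats=5,
                                random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first

for name, value in sorted(auc.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} AUC = {value:.3f}")
print("Best model:", best_name, "| top feature indices:", ranking[:5])
```

On synthetic data the ranking of algorithms carries no clinical meaning; the sketch only shows the workflow of fitting several classifiers on one split, comparing validation AUC, and ranking features for the selected model.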

Results: A total of 18 variables were selected for the models. The GBM model achieved the best discriminatory ability (area under the receiver operating characteristic curve = 0.873, accuracy = 0.885, sensitivity = 0.713), and on the basis of this model, an online tool was built to help clinicians optimize the diagnosis of OSA-related hypertension. Finally, the SHAP method identified age, family history of hypertension, minimum arterial oxygen saturation, body mass index, and percentage of time with SaO2 < 90% as the top five variables contributing to the diagnosis of OSA-related hypertension.
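For readers less familiar with the three reported metrics, the sketch below computes accuracy, sensitivity, and the area under the ROC curve from scratch on a toy example. The labels, scores, and the 0.5 decision threshold are hypothetical and unrelated to the study data; the function names are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of all cases classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def sensitivity(y_true, y_pred):
    """Fraction of true positives among all actual positives (recall)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred)
auc_value = roc_auc(y_true, scores)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} auc={auc_value:.3f}")
```

In the toy example accuracy (0.750) is higher than sensitivity (0.667) because negative cases outnumber positive ones and are mostly classified correctly. A similar gap between accuracy and sensitivity can arise in any data set where the two classes are unevenly represented.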

Conclusion: We established a risk prediction model for OSA-related hypertension using ML methods and demonstrated that, among the six ML models, the gradient boosting machine model performed best. This prediction model could help to identify OSA patients at high risk of hypertension, provide early and individualized diagnoses and treatment plans, protect patients from the serious consequences of OSA-related hypertension, and minimize the burden on society.

Keywords: Shapley additive explanations; gradient boosting machine (GBM); hypertension; machine learning; obstructive sleep apnea; risk factor.