Prediction Model for Hypertension and Diabetes Mellitus Using Korean Public Health Examination Data (2002-2017)

Diagnostics (Basel). 2022 Aug 14;12(8):1967. doi: 10.3390/diagnostics12081967.

Abstract

Hypertension and diabetes mellitus are major chronic diseases and important factors in the management of cardiovascular disease. Preventing chronic disease requires proper health management through periodic health check-ups. The purpose of this study was to determine the incidence of hypertension and diabetes mellitus according to health check-up participation, and to develop a predictive model for hypertension and diabetes based on it. We used the National Health Insurance Corporation database of Korea and examined whether hypertension or diabetes subsequently occurred according to the number of health check-ups attended over the preceding 10 years. Compared with those who underwent five health check-ups, those attending their first screening had higher odds of hypertension (OR = 2.18, 95% CI = 2.14-2.22), diabetes mellitus (OR = 1.33, 95% CI = 1.30-1.35) and both diseases (OR = 2.46, 95% CI = 2.39-2.53), whereas individuals who underwent 10 screenings had lower odds of hypertension (OR = 0.86, 95% CI = 0.83-0.88), diabetes mellitus (OR = 0.83, 95% CI = 0.81-0.85) and both diseases (OR = 0.83, 95% CI = 0.79-0.87). Compared with individuals who attended five or more screenings, individuals who attended fewer than five screenings had higher odds of hypertension (OR = 1.61, 95% CI = 1.59-1.62; AUC = 0.66), diabetes mellitus (OR = 1.21, 95% CI = 1.20-1.22; AUC = 0.59) and both diseases (OR = 1.75, 95% CI = 1.72-1.78; AUC = 0.63). The machine learning-based prediction model using XGBoost outperformed the conventional logistic regression model in all datasets when predicting hypertension (accuracy, 0.828 vs. 0.628; F1-score, 0.800 vs. 0.633; AUC, 0.828 vs. 0.630), diabetes mellitus (accuracy, 0.707 vs. 0.575; F1-score, 0.663 vs. 0.576; AUC, 0.710 vs. 0.575) and both diseases (accuracy, 0.950 vs. 0.612; F1-score, 0.950 vs. 0.614; AUC, 0.952 vs. 0.612).
Health check-up participation was found to strongly influence the occurrence of hypertension and diabetes, and screening frequency ranked above the other factors in variable importance.
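
For readers unfamiliar with how figures such as "OR = 2.18, 95% CI = 2.14-2.22" are typically obtained, the following is a minimal sketch of an unadjusted odds ratio with a Wald confidence interval computed from a 2x2 table. It is not the authors' method: the abstract does not state how the ORs were estimated (they may well be adjusted estimates from a regression model), and the function name `odds_ratio_ci` and the counts in the usage example are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    z: normal quantile for the interval (1.96 gives roughly 95%)

    All four counts must be positive, otherwise the estimate is undefined.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from the study):
# "exposed" = fewer than five screenings, "unexposed" = five or more.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as all of the intervals reported in the abstract do, indicates that the association is statistically significant at the corresponding level. The very narrow intervals in the abstract are consistent with a large sample size.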

Keywords: XGBoost; diabetes mellitus; health check-up; hypertension; logistic regression; random forest.