Predicting Multimorbidity Using Saudi Health Indicators (Sharik) Nationwide Data: Statistical and Machine Learning Approach

Healthcare (Basel). 2023 Jul 31;11(15):2176. doi: 10.3390/healthcare11152176.

Abstract

The Saudi population is at high risk of multimorbidity. The risk of these morbidities can be reduced by identifying common modifiable behavioural risk factors. This study uses statistical and machine learning methods to identify predictors of multimorbidity in the Saudi population. Data from 23,098 Saudi residents were extracted from the "Sharik" Health Indicators Surveillance System 2021. Participants were asked about their demographics and health indicators. Binary logistic regression models were used to determine predictors of multimorbidity. A backpropagation neural network model was then fitted using the predictors from the logistic regression model. Accuracy was assessed on training, validation, and testing data. Females and smokers had the highest likelihood of experiencing multimorbidity. Age and fruit consumption were also significant predictors of multimorbidity. The two approaches yielded broadly comparable accuracy, with the backpropagation method (80.7%) somewhat more accurate than the logistic regression model (77%). Machine learning algorithms can be used to predict multimorbidity among adults, particularly in the Middle East region. The common predictors identified in this study were subsequently validated using different testing methods. These factors can help policymakers consider improvements in the public health domain.
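The two-stage workflow described above (a binary logistic regression followed by a backpropagation neural network on the same predictors, with accuracy checked on training, validation, and testing data) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes synthetic data in place of the Sharik records, invented coefficients for the four predictors, a 70/15/15 split, and a single hidden layer of four units; none of these details are given in the abstract.

```python
import math
import random

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_synthetic(n, rng):
    # Hypothetical predictors: age (scaled 0-1), female, smoker, fruit intake (scaled 0-1).
    # The coefficients below are invented for illustration, not taken from the study.
    rows = []
    for _ in range(n):
        x = [rng.random(), float(rng.random() < 0.5),
             float(rng.random() < 0.3), rng.random()]
        logit = -2.0 + 2.5 * x[0] + 0.6 * x[1] + 0.8 * x[2] - 0.7 * x[3]
        rows.append((x, 1.0 if rng.random() < sigmoid(logit) else 0.0))
    return rows

def fit_logistic(data, epochs=50, lr=0.05):
    # Binary logistic regression fitted by stochastic gradient descent.
    w, b = [0.0] * 4, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return lambda x: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fit_network(data, rng, hidden=4, epochs=50, lr=0.05):
    # One-hidden-layer network trained by backpropagation (cross-entropy loss).
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
             for j in range(hidden)]
        return h, sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + b2)

    for _ in range(epochs):
        for x, y in data:
            h, p = forward(x)
            d_out = p - y  # gradient of the loss w.r.t. the output pre-activation
            for j in range(hidden):
                # Propagate the error back through the old output weight first.
                d_hid = d_out * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_hid
                for i in range(4):
                    w1[j][i] -= lr * d_hid * x[i]
            b2 -= lr * d_out
    return lambda x: forward(x)[1]

def accuracy(predict, data):
    # Share of cases where the 0.5-thresholded prediction matches the label.
    return sum((predict(x) >= 0.5) == (y == 1.0) for x, y in data) / len(data)

def run_demo(seed=0, n=600):
    rng = random.Random(seed)
    data = make_synthetic(n, rng)
    n_train, n_val = int(0.7 * n), int(0.15 * n)  # assumed 70/15/15 split
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]
    models = {"logistic": fit_logistic(train),
              "network": fit_network(train, rng)}
    return {name: {"train": accuracy(m, train),
                   "validation": accuracy(m, val),
                   "test": accuracy(m, test)}
            for name, m in models.items()}

results = run_demo()
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Because the data are simulated, the printed accuracies say nothing about the study's reported figures; the sketch only shows how the two models can be compared on the same training, validation, and testing partitions.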

Keywords: backpropagation methods; health indicators surveillance; logistic regression; multimorbidity; prediction.

Grants and funding

This research received no funding.