Improvement of Time Forecasting Models Using Machine Learning for Future Pandemic Applications Based on COVID-19 Data 2020-2022

Diagnostics (Basel). 2023 Mar 15;13(6):1121. doi: 10.3390/diagnostics13061121.

Abstract

Improving forecasts, particularly the accuracy, efficiency, and precision of time-series forecasts, is becoming critical for authorities to predict, monitor, and prevent the spread of the Coronavirus disease. However, the results obtained from the predictive models are imprecise and inefficient because the dataset contains linear and non-linear patterns, respectively. Linear models such as autoregressive integrated moving average cannot be used effectively to predict complex time series, so nonlinear approaches are better suited for such a purpose. Therefore, to achieve a more accurate and efficient predictive value of COVID-19 that is closer to the true value of COVID-19, a hybrid approach was implemented. Therefore, the objectives of this study are twofold. The first objective is to propose intelligence-based prediction methods to achieve better prediction results called autoregressive integrated moving average-least-squares support vector machine. The second objective is to investigate the performance of these proposed models by comparing them with the autoregressive integrated moving average, support vector machine, least-squares support vector machine, and autoregressive integrated moving average-support vector machine. Our investigation is based on three COVID-19 real datasets, i.e., daily new cases data, daily new death cases data, and daily new recovered cases data. Then, statistical measures such as mean square error, root mean square error, mean absolute error, and mean absolute percentage error were performed to verify that the proposed models are better than the autoregressive integrated moving average, support vector machine model, least-squares support vector machine, and autoregressive integrated moving average-support vector machine. Empirical results using three recent datasets of known the Coronavirus Disease-19 cases in Malaysia show that the proposed model generates the smallest mean square error, root mean square error, mean absolute error, and mean absolute percentage error values for training and testing datasets compared to the autoregressive integrated moving average, support vector machine, least-squares support vector machine, and autoregressive integrated moving average-support vector machine models. This means that the predicted value of the proposed model is closer to the true value. These results demonstrate that the proposed model can generate estimates more accurately and efficiently. Compared to the autoregressive integrated moving average, support vector machine, least-squares support vector machine, and autoregressive integrated moving average-support vector machine models, our proposed models perform much better in terms of percent error reduction for both training and testing all datasets. Therefore, the proposed model is possibly the most efficient and effective way to improve prediction for future pandemic performance with a higher level of accuracy and efficiency.

Keywords: COVID-19 pandemic; accuracy and efficiency; forecasting; hybrid models; machine learning; public health.

Grants and funding

The publication is partially sponsored by the Research Management Office, Universiti Malaysia Terengganu (UMT).