AI in Healthcare: Time-Series Forecasting Using Statistical, Neural, and Ensemble Architectures

Shruti Kaushik; Abhinav Choudhury; Pankaj Kumar Sheron; Nataraj Dasgupta; Sayee Natarajan; Larry A Pickett; Varun Dutt

doi:10.3389/fdata.2020.00004

Front Big Data. 2020 Mar 19;3:4. doi: 10.3389/fdata.2020.00004. eCollection 2020.

Authors

Shruti Kaushik¹, Abhinav Choudhury¹, Pankaj Kumar Sheron¹, Nataraj Dasgupta², Sayee Natarajan², Larry A Pickett², Varun Dutt¹

Affiliations

¹ Applied Cognitive Science Laboratory, Indian Institute of Technology Mandi, Mandi, India.
² RxDataScience, Inc., Durham, NC, United States.

Abstract

Both statistical and neural methods have been proposed in the literature to predict healthcare expenditures. However, less attention has been given to comparing the predictions of these two classes of methods with each other and with ensemble approaches in the healthcare domain. The primary objective of this paper was to evaluate how well different statistical, neural, and ensemble techniques predict patients' weekly average expenditures on certain pain medications. Five models were calibrated to predict the expenditures on two different pain medications: two statistical models, persistence (baseline) and autoregressive integrated moving average (ARIMA); a multilayer perceptron (MLP) model; a long short-term memory (LSTM) model; and an ensemble model combining the predictions of the ARIMA, MLP, and LSTM models. For the MLP and LSTM models, we compared the influence of shuffling the training data and of applying dropout during training (to certain nodes in MLP layers, and to nodes and recurrent connections in LSTM layers). Results revealed that the ensemble model outperformed the persistence, ARIMA, MLP, and LSTM models across both pain medications. In general, across both medications, the MLP models benefited from not shuffling the training data and adding dropout, whereas the LSTM models benefited from shuffling the training data and not adding dropout. We highlight the implications of using statistical, neural, and ensemble methods for time-series forecasting of outcomes in the healthcare domain.

Keywords: autoregressive integrated moving average (ARIMA); ensemble; long short-term memory (LSTM); medicine expenditures; multilayer perceptron (MLP); neural networks; persistence; time-series forecasting.
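
To make the persistence baseline and the ensemble idea from the abstract concrete, the following is a minimal sketch in Python. It rests on several assumptions not stated in the abstract: persistence is taken to mean "predict each week's value as the previous week's observation", the ensemble is shown as an unweighted average of component forecasts (the paper's actual combination rule may differ), and root-mean-square error is used purely as an illustrative error metric. The function names and the sample series are hypothetical.

```python
import math


def persistence_forecast(series):
    """One-step-ahead persistence: the forecast for week t is the value at week t-1.

    Returns forecasts aligned with series[1:].
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    return list(series[:-1])


def average_ensemble(*forecasts):
    """Combine equal-length forecasts by an unweighted mean (an assumed rule)."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    if len({len(f) for f in forecasts}) != 1:
        raise ValueError("forecasts must have equal length")
    return [sum(values) / len(values) for values in zip(*forecasts)]


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical weekly average expenditures (illustrative values, not from the paper).
weekly = [10.0, 12.0, 11.0, 13.0]
actual = weekly[1:]

baseline = persistence_forecast(weekly)
# Stand-ins for forecasts that fitted ARIMA, MLP, and LSTM models would produce.
model_a = [11.0, 11.5, 12.0]
model_b = [12.5, 10.5, 13.5]
combined = average_ensemble(model_a, model_b)

print("persistence RMSE:", rmse(actual, baseline))
print("ensemble RMSE:", rmse(actual, combined))
```

In practice the component forecasts would come from fitted models, and the comparison would be run on held-out weeks rather than on the series used for fitting.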