Active Learning for Multi-way Sensitivity Analysis with Application to Disease Screening Modeling

Mucahit Cevik; Sabrina Angco; Elham Heydarigharaei; Hadi Jahanshahi; Nicholas Prayogo

doi:10.1007/s41666-022-00117-y

J Healthc Inform Res. 2022 Jul 15;6(3):317-343. doi: 10.1007/s41666-022-00117-y. eCollection 2022 Sep.

Authors

Mucahit Cevik¹, Sabrina Angco¹, Elham Heydarigharaei¹, Hadi Jahanshahi¹, Nicholas Prayogo¹

Affiliation

¹ Toronto Metropolitan University, 44 Gerrard St E, Toronto, M5B 1G3 Ontario Canada.

Abstract

Sensitivity analysis is an important aspect of model development, as it can be used to assess the level of confidence associated with the outcomes of a study. In many practical problems, sensitivity analysis involves evaluating a large number of parameter combinations, which may require an extensive amount of time and resources. However, this computational burden can be avoided by identifying smaller subsets of parameter combinations whose outcomes can later be used to generate the desired outcomes for the other parameter combinations. In this study, we investigate machine learning-based approaches for speeding up sensitivity analysis. Furthermore, we apply feature selection methods to identify the relative importance of quantitative model parameters in terms of their ability to predict the outcomes. Finally, we highlight the effectiveness of active learning strategies in improving the sensitivity analysis process by reducing the total number of quantitative model runs required to construct a high-performance prediction model. Our experiments on two datasets, obtained from the sensitivity analyses performed for two disease screening modeling studies, indicate that ensemble methods such as Random Forests and XGBoost consistently outperform other machine learning algorithms in the associated prediction task. In addition, we note that active learning can lead to significant speed-ups in sensitivity analysis by enabling the selection of more useful parameter combinations (i.e., instances) for training the prediction models.

Keywords: Active learning; Disease screening; Machine learning; Regression; Sensitivity analysis.
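
The active learning idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `simulate` function is a hypothetical stand-in for an expensive quantitative model run, the two-parameter grid is invented for illustration, and the learner is a small committee of bootstrapped k-nearest-neighbour regressors (a query-by-committee strategy) rather than the Random Forests or XGBoost models named in the abstract. At each step, the loop runs the model only for the parameter combination on which the committee disagrees most.

```python
import math
import random
import statistics


def simulate(params):
    """Hypothetical stand-in for one expensive quantitative model run."""
    x, y = params
    return math.sin(3.0 * x) + y ** 2


def knn_predict(train, query, k=3):
    """Mean outcome of the k labelled points closest to `query`."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))

    nearest = sorted(train, key=sq_dist)[:k]
    return statistics.fmean([outcome for _, outcome in nearest])


def active_learning(pool, budget, n_init=5, n_members=10, seed=0):
    """Label `budget` points from `pool`, querying where the committee disagrees most."""
    rng = random.Random(seed)
    unlabelled = list(pool)
    rng.shuffle(unlabelled)
    # Seed the labelled set with a few randomly chosen model runs.
    labelled = [(p, simulate(p)) for p in unlabelled[:n_init]]
    unlabelled = unlabelled[n_init:]

    while len(labelled) < budget and unlabelled:
        # Each committee member is a k-NN model on a bootstrap resample.
        committee = [rng.choices(labelled, k=len(labelled))
                     for _ in range(n_members)]

        def disagreement(p):
            # Variance of the committee's predictions at candidate p.
            return statistics.pvariance([knn_predict(m, p) for m in committee])

        query = max(unlabelled, key=disagreement)
        unlabelled.remove(query)
        labelled.append((query, simulate(query)))  # one additional model run
    return labelled


# Assumed example: a 15 x 15 grid of two parameters scaled to [0, 1].
pool = [(i / 14, j / 14) for i in range(15) for j in range(15)]
runs = active_learning(pool, budget=30)
print(len(runs), "of", len(pool), "parameter combinations were simulated")
```

The labelled pairs in `runs` could then be used to train a regression model that predicts the outcomes of the remaining parameter combinations without further model runs. Whether this selection rule beats random sampling depends on the problem and is not demonstrated by this sketch.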