Using machine learning to identify important predictors of COVID-19 infection prevention behaviors during the early phase of the pandemic

Patterns (N Y). 2022 Apr 8;3(4):100482. doi: 10.1016/j.patter.2022.100482. Epub 2022 Mar 9.

Abstract

Before vaccines for coronavirus disease 2019 (COVID-19) became available, a set of infection-prevention behaviors constituted the primary means of mitigating the spread of the virus. Our study aimed to identify important predictors of this set of behaviors. Whereas social and health psychological theories suggest a limited set of predictors, machine-learning analyses can identify correlates from a larger pool of candidate predictors. We used random forests to rank 115 candidate correlates of infection-prevention behavior in 56,072 participants across 28 countries, using data collected from March to May 2020. The machine-learning model predicted 52% of the variance in infection-prevention behavior in a separate test sample, exceeding the performance of psychological models of health behavior. Results indicated that the two most important predictors related to individual-level injunctive norms. Illustrating how data-driven methods can complement theory, some of the most important predictors were not derived from theories of health behavior, and some theoretically derived predictors were relatively unimportant.
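
The general workflow described above (fit a random forest, measure variance explained in a held-out test sample, rank candidate predictors by importance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: it uses scikit-learn, permutation importance as the ranking measure (the abstract does not say which importance measure was used), a 70/30 split, and purely synthetic data with hypothetical predictor names. The helper `rank_predictors` is likewise hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def rank_predictors(X, y, names, seed=0):
    """Fit a random forest and rank predictors on a held-out test sample.

    Hypothetical helper for illustration; returns (test R^2, ranking),
    where ranking is a list of (name, mean importance), most important first.
    """
    # Hold out a separate test sample (the 70/30 split is an assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    forest = RandomForestRegressor(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)

    # Proportion of variance explained (R^2) in the test sample.
    test_r2 = forest.score(X_test, y_test)

    # Permutation importance: the drop in test R^2 when one predictor
    # is shuffled (an assumed choice of importance measure).
    result = permutation_importance(
        forest, X_test, y_test, n_repeats=5, random_state=seed
    )
    order = np.argsort(result.importances_mean)[::-1]
    ranking = [(names[i], float(result.importances_mean[i])) for i in order]
    return test_r2, ranking


# Synthetic data only: two informative predictors and three noise predictors.
# The names are placeholders and do not correspond to the study's variables.
rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 5))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
names = ["strong_predictor", "weak_predictor", "noise_a", "noise_b", "noise_c"]

test_r2, ranking = rank_predictors(X, y, names)
print(f"Test-sample R^2: {test_r2:.2f}")
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Evaluating both the variance explained and the importances on the held-out sample, rather than on the training data, keeps the ranking from rewarding predictors that the forest has merely overfit.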

Keywords: COVID-19; health behaviors; machine learning; public goods dilemma; random forest; social norms.