Spatial machine learning for predicting physical inactivity prevalence from socioecological determinants in Chicago, Illinois, USA

J Geogr Syst. 2023 Jun 5:1-21. doi: 10.1007/s10109-023-00415-y. Online ahead of print.

Abstract

The increase in physical inactivity prevalence in the USA has been associated with neighborhood characteristics. Although several studies have found an association between neighborhood and health, the relative importance of each neighborhood factor for physical inactivity, and how that importance varies geographically (i.e., across different neighborhoods), remain unexplored. This study ranks the contribution of seven socioecological neighborhood factors to physical inactivity prevalence in Chicago, Illinois, using machine learning models at the census tract level, and evaluates the models' predictive capabilities. First, we use geographical random forest (GRF), a recently proposed nonlinear machine learning regression method that assesses the spatial variation of each predictive factor and its contribution to physical inactivity prevalence. Then, we compare the predictive performance of GRF with that of geographically weighted artificial neural networks, another recently proposed spatial machine learning algorithm. Our results suggest that poverty is the most important determinant of physical inactivity prevalence in the Chicago tracts, whereas green space is the least important. As a result, interventions can be designed and implemented based on specific local circumstances rather than on broad concepts that apply equally to Chicago and other large cities.
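
The idea of a locally varying factor ranking can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' GRF implementation. It only mimics the local calibration step: for each location, a ranking is computed from its k nearest locations alone. A much simpler score (absolute Pearson correlation) stands in for random forest variable importance. The grid of locations, the two predictor names, the synthetic outcome, and the neighbourhood size k = 15 are all assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def local_ranking(coords, X, y, names, k):
    """For each location, rank predictors using only its k nearest locations.

    Stand-in score: |Pearson r| between predictor and outcome within the
    neighbourhood (an assumption; GRF itself uses local random forests).
    """
    rankings = []
    for cx, cy in coords:
        # Adaptive neighbourhood: the k nearest locations, focal one included.
        nearest = sorted(
            range(len(coords)),
            key=lambda j: math.hypot(cx - coords[j][0], cy - coords[j][1]),
        )[:k]
        y_loc = [y[j] for j in nearest]
        scores = {
            name: abs(pearson([X[j][col] for j in nearest], y_loc))
            for col, name in enumerate(names)
        }
        rankings.append(sorted(names, key=lambda nm: scores[nm], reverse=True))
    return rankings


# Synthetic data (hypothetical): a 10 x 10 grid of "tracts" and two predictors.
random.seed(0)
names = ["poverty", "green_space"]
coords = [(i, j) for i in range(10) for j in range(10)]
X = [[random.random(), random.random()] for _ in coords]

# The outcome is driven by poverty in the west half and green space in the east.
y = []
for (cx, _), (poverty, green) in zip(coords, X):
    driver = poverty if cx < 5 else green
    y.append(driver + 0.01 * random.random())

rankings = local_ranking(coords, X, y, names, k=15)
print("south-west corner:", rankings[0])
print("north-east corner:", rankings[-1])
```

In this toy setting the ranking differs between the two corners of the grid, which is the kind of spatial variation in factor importance that a single global model would average away.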

Supplementary information: The online version contains supplementary material available at 10.1007/s10109-023-00415-y.

Keywords: Behavioral health; Chicago; Neighborhood; Physical inactivity prevalence; Spatial machine learning model.