Understanding cross-data dynamics of individual and social/environmental factors through a public health lens: explainable machine learning approaches

Front Public Health. 2023 Oct 26:11:1257861. doi: 10.3389/fpubh.2023.1257861. eCollection 2023.

Abstract

Introduction: The rising prevalence of obesity has become a public health concern, requiring efficient and comprehensive prevention strategies.

Methods: This study investigated the combined influence of individual and social/environmental factors on obesity within the urban landscape of Seoul using explainable machine learning approaches. We used 'Community Health Surveys' and credit card usage data to represent individual factors, and 'Seoul Open Data' to represent social/environmental factors contributing to obesity. A Random Forest model was used to predict obesity based on individual factors. Shapley Additive Explanations (SHAP) were then applied to the model to determine each factor's relative importance in obesity prediction. For social/environmental factors, we used the Geographically Weighted Least Absolute Shrinkage and Selection Operator (GWLASSO) to estimate the regression coefficients.
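The idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values by enumerating every feature coalition. This is not the study's pipeline: the toy scoring function, its coefficients, and the three binary features (named after factors mentioned in this abstract) are hypothetical, and practical SHAP implementations for tree ensembles use much faster algorithms than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy score standing in for a trained model (assumption).
def toy_model(z):
    self_aware, weight_control, high_bp = z
    return (0.2 + 0.4 * self_aware + 0.2 * weight_control
            + 0.1 * high_bp + 0.1 * self_aware * high_bp)


x = [1, 1, 1]          # instance to explain
baseline = [0, 0, 0]   # reference instance
phi = shapley_values(toy_model, x, baseline)
# phi is approximately [0.45, 0.20, 0.15]; the values sum to
# toy_model(x) - toy_model(baseline) = 0.8
```

The attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that allows factors to be ranked by relative importance. The interaction term is split evenly between the two features involved.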

Results: The Random Forest model predicted obesity with an accuracy of >90%. SHAP showed that the influential individual obesity-related factors differed between Gu districts, although 'self-awareness of obesity', 'weight control experience', and 'high blood pressure experience' were among the top five influential factors in all Gu districts. GWLASSO indicated that the regression coefficients of social/environmental factors varied across districts.
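District-specific coefficients of this kind arise because a geographically weighted lasso fits a separate penalised regression at each location, weighting observations by their distance to that location. The sketch below is a minimal pure-Python illustration under stated assumptions, not the study's implementation: the Gaussian kernel, the fixed bandwidth and penalty, the coordinate-descent solver, and the synthetic data (two hypothetical features, with the first having an opposite effect in two distant clusters) are all chosen for illustration only.

```python
import math


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def weighted_lasso(X, y, w, lam, n_iter=500):
    """Minimise 0.5 * sum_i w_i (y_i - b0 - x_i.b)^2 + lam * sum_j |b_j|
    by cyclic coordinate descent. The intercept b0 is not penalised."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    total_w = sum(w)
    for _ in range(n_iter):
        intercept = sum(
            w[i] * (y[i] - sum(X[i][k] * beta[k] for k in range(p)))
            for i in range(n)
        ) / total_w
        for j in range(p):
            rho, z = 0.0, 0.0
            for i in range(n):
                partial = y[i] - intercept - sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += w[i] * X[i][j] * partial
                z += w[i] * X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return intercept, beta


def gaussian_weights(coords, center, bandwidth):
    """Gaussian kernel weights: nearby observations count more."""
    return [math.exp(-0.5 * (math.dist(c, center) / bandwidth) ** 2)
            for c in coords]


def gwlasso(X, y, coords, bandwidth, lam):
    """Fit one weighted lasso per location; returns (intercept, beta) pairs."""
    return [weighted_lasso(X, y, gaussian_weights(coords, c, bandwidth), lam)
            for c in coords]


# Synthetic data (assumption): five western and five eastern locations.
coords = [(float(i), 0.0) for i in range(5)] + [(20.0 + i, 0.0) for i in range(5)]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical relevant factor
x2 = [0.5, -0.5, 0.5, -0.5, 0.5]    # hypothetical irrelevant factor
X = [[a, b] for a, b in zip(x1, x2)] * 2
y = [2.0 * a for a in x1] + [-1.0 * a for a in x1]  # effect flips by region

fits = gwlasso(X, y, coords, bandwidth=2.0, lam=0.1)
west_beta1 = [beta[0] for _, beta in fits[:5]]
east_beta1 = [beta[0] for _, beta in fits[5:]]
```

In this toy setup the first coefficient is close to 2 in the western cluster and close to -1 in the eastern cluster, while the lasso penalty shrinks the irrelevant second coefficient to zero at every location. In practice the bandwidth and penalty would be tuned, for example by cross-validation.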

Conclusion: Our findings provide valuable insights for designing targeted obesity prevention programs that integrate individual and social/environmental factors within the context of urban design, and show that these factors can differ even within the same city. This study supports the development and application of explainable machine learning in devising urban health strategies. We recommend that each autonomous district consider its own influential factors when designing budget plans to tackle obesity effectively.

Keywords: GWLASSO; SHAP; influential factors; machine learning; obesity.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Health Surveys
  • Humans
  • Machine Learning
  • Obesity* / epidemiology
  • Public Health*

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was partly funded by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. NRF-2022R1C1C1010458, Cross Bio-Sensing System for the Future XR Interface, 50%), and Korea Institute of Police Technology (KIPoT) grant funded by the Korean government (KNPA) (No. 092021C26S02000, Development of Transportation Safety Infrastructure Technology for Lv.4 Connected Autonomous Driving, 50%).