Prediction and interpretation of antibiotic-resistance genes occurrence at recreational beaches using machine learning models

J Environ Manage. 2023 Feb 15:328:116969. doi: 10.1016/j.jenvman.2022.116969. Epub 2022 Dec 7.

Abstract

Antibiotic-resistant bacteria and antibiotic resistance genes (ARGs) are pollutants of worldwide concern that seriously threaten public health and ecosystems. Machine learning (ML) models have been applied to predict ARGs in beach waters. However, existing studies were conducted at a single location and had low prediction performance. Moreover, ML models are "black boxes" that do not reveal the internal mechanisms behind their predictions. This lack of transparency undermines trust and can have serious consequences when the models inform high-stakes decisions. In this study, we developed a gradient boosted regression tree (GBRT)-based ML model and then described its behavior using four explainable artificial intelligence (XAI) model-agnostic explanation methods. We used hydro-meteorological and qPCR data from beaches in South Korea and Pakistan to develop ML prediction models for aac(6')-Ib-cr, sul1, and tetX, with 10-fold time-blocked cross-validation performances of 4.9, 2.06, and 4.4 root mean squared logarithmic error, respectively. We then analyzed the local and global behavior of the developed ML models using the four interpretation methods. The models showed that water temperature, precipitation, and tide are the most important predictors of ARGs at recreational beaches. We show that the model-agnostic interpretation methods not only explain the behavior of the ML model but also provide insight into how it behaves under new, unseen conditions. Moreover, these post-processing techniques can serve as a debugging tool for ML-based modeling.
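The evaluation metric and validation scheme named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the time blocks were formed, so the code below assumes one common interpretation: samples are sorted by time and split into contiguous test blocks, with all remaining samples used for training. The function names `rmsle` and `time_blocked_folds` are hypothetical and are not taken from the study.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) = log(1 + x), which keeps zero counts well defined
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


def time_blocked_folds(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) pairs for time-ordered samples.

    Assumption: each test fold is one contiguous block in time, and the
    training set is every sample outside that block.
    """
    if n_folds < 2 or n_folds > n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for k in range(n_folds):
        size = base + (1 if k < extra else 0)  # spread the remainder evenly
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size
```

In use, a regressor would be fitted on `train_idx` and scored with `rmsle` on `test_idx` for each fold, and the fold scores averaged. Keeping each test block contiguous prevents temporally adjacent, highly correlated samples from landing on both sides of the split.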

Keywords: Antibiotic resistance genes; Artificial intelligence; Black box models; Explainable; Machine learning; Recreational beaches.

MeSH terms

  • Artificial Intelligence*
  • Bacteria / genetics
  • Drug Resistance, Microbial / genetics
  • Ecosystem*
  • Machine Learning