Screening priority pesticides for drinking water quality regulation and monitoring by machine learning: Analysis of factors affecting detectability

J Environ Manage. 2023 Jan 15;326(Pt A):116738. doi: 10.1016/j.jenvman.2022.116738. Epub 2022 Nov 11.

Abstract

Properly selecting new contaminants for regulation or monitoring, before implementation, is an important issue for regulators and water supply utilities. Herein, we constructed and evaluated machine learning models for predicting the detectability (detection/non-detection) of pesticides in surface water used as a drinking water source. Classification and regression models were each constructed with Random Forest, XGBoost, and LightGBM; of these, the LightGBM classification model had the highest prediction accuracy. Furthermore, it outperformed the detectability index method, which is based on runoff models from previous studies, in Recall, Precision, and F-measure. Regardless of the type of machine learning model, the number of annual measurements, the sales quantity of pesticides for rice-paddy fields, and water quality guideline values were the most important model features (explanatory variables). Analysis of the impact of the features suggested the presence of a threshold (or range) above which detectability increased. In addition, when a feature (e.g., quantity of pesticide sales) exceeded the threshold value beyond which it increased the likelihood of detection, other features also affected detectability synergistically. The proportions of false positives and false negatives varied depending on the features used. The advantage of the machine learning models is their ability to represent nonlinear and complex relationships between features and pesticide detectability that existing risk scoring methods cannot represent.
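For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how Precision, Recall, and F-measure are conventionally computed for a binary detection/non-detection classifier. It is not the authors' code: the function name `detection_metrics` and the example labels are hypothetical, and F-measure is assumed here to mean the usual harmonic mean of Precision and Recall (F1).

```python
def detection_metrics(observed, predicted):
    """Return (precision, recall, f_measure) for binary labels.

    1 = pesticide detected, 0 = not detected.
    Illustrative helper only; not taken from the paper.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true positives
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false positives
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or observed.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical monitoring outcomes for eight pesticides (made-up data).
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]

precision, recall, f_measure = detection_metrics(observed, predicted)
print(precision, recall, f_measure)
```

In this framing, a false positive is a pesticide predicted to be detected that was not, and a false negative is a detected pesticide the model missed; a lower Precision reflects more of the former, a lower Recall more of the latter.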

Keywords: Drinking water source; Guideline; Monitoring; Pesticide concentration; Prediction; Screening.

MeSH terms

  • Drinking Water* / analysis
  • Environmental Monitoring
  • Machine Learning
  • Pesticides* / analysis
  • Water Pollutants, Chemical* / analysis
  • Water Quality

Substances

  • Pesticides
  • Drinking Water
  • Water Pollutants, Chemical