Investigating bromide incorporation factor (BIF) and model development for predicting THMs in drinking water using machine learning

Sci Total Environ. 2024 Jan 1:906:167595. doi: 10.1016/j.scitotenv.2023.167595. Epub 2023 Oct 5.

Abstract

Many disinfection byproducts (DBPs) in drinking water can pose cancer risks to humans, and several DBPs, including trihalomethanes, are typically regulated. Although trihalomethanes are regulated, the brominated species (bromodichloromethane, dibromochloromethane and bromoform) are more toxic to humans than the chlorinated species (chloroform). To date, more than 100 models have been reported for predicting DBPs. However, models for predicting individual trihalomethanes are very limited, indicating the need for such models. Various factors, including natural organic matter (NOM), bromide ions (Br⁻), disinfectants (e.g., chlorine dose), pH, temperature and reaction time, affect the formation and distribution of trihalomethanes in drinking water. In this study, NOM was fractionated into four groups based on molecular weight (MW) cutoff values, and their respective contributions to dissolved organic carbon (DOC), trihalomethanes and bromide incorporation factors (BIF) were investigated. Models were developed for predicting chloroform, bromodichloromethane, dibromochloromethane, bromoform and trihalomethanes. Three machine learning techniques, Support Vector Regressor (SVR), Random Forest Regressor (RFR) and Artificial Neural Networks (ANN), were adopted for training and testing the models. The normalized BIFs were in the ranges of 0.08-0.16 and 0.07-0.15 per mg/L of DOC for pH 6.0 and 8.5, respectively. The BIFs were higher at lower pH and for lower MW values, and increasing the bromide to chlorine ratio increased the BIFs. The models showed excellent predictive performance on the training (R² = 0.889-0.998) and testing (R² = 0.870-0.988) datasets. The SVR and RFR models showed the best performance, with lower RMSE and MAE in most cases. These models can be used to better control different trihalomethanes in drinking water, to maintain regulatory compliance and to minimize the risks to humans.
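For readers unfamiliar with the BIF, the following is a minimal sketch of how it is commonly computed, assuming the widely used definition: the molar-weighted average number of bromine atoms per trihalomethane molecule, ranging from 0 (chloroform only) to 3 (bromoform only). The abstract does not give the formula, so the function names, the dictionary layout and the DOC normalization (BIF divided by DOC in mg/L) are illustrative assumptions rather than the authors' procedure.

```python
# Molar masses in g/mol, computed from standard atomic weights.
MOLAR_MASS = {
    "chloroform": 119.38,            # CHCl3
    "bromodichloromethane": 163.83,  # CHBrCl2
    "dibromochloromethane": 208.28,  # CHBr2Cl
    "bromoform": 252.73,             # CHBr3
}

# Number of bromine atoms in each trihalomethane species.
BROMINE_ATOMS = {
    "chloroform": 0,
    "bromodichloromethane": 1,
    "dibromochloromethane": 2,
    "bromoform": 3,
}


def bromide_incorporation_factor(conc_ug_per_l):
    """Return the BIF for mass concentrations given in ug/L.

    Assumed definition: sum(n_Br * molar conc) / sum(molar conc).
    """
    # Convert each mass concentration (ug/L) to a molar one (umol/L).
    molar = {name: conc_ug_per_l[name] / MOLAR_MASS[name] for name in MOLAR_MASS}
    total = sum(molar.values())
    if total == 0:
        raise ValueError("total trihalomethane concentration is zero")
    return sum(BROMINE_ATOMS[name] * molar[name] for name in molar) / total


def normalized_bif(conc_ug_per_l, doc_mg_per_l):
    """Return BIF per mg/L of DOC (assumed normalization, hypothetical helper)."""
    if doc_mg_per_l <= 0:
        raise ValueError("DOC must be positive")
    return bromide_incorporation_factor(conc_ug_per_l) / doc_mg_per_l
```

Under this definition, a sample containing equal molar amounts of the four species gives a BIF of 1.5, and dividing by the DOC of the corresponding NOM fraction gives a value expressed per mg/L of DOC, the unit in which the normalized BIFs are reported above.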

Keywords: Artificial Neural Networks; Bromide incorporation factor; Machine learning models; NOM fractionation; Random Forest Regressor; Support Vector Regressor.

MeSH terms

  • Bromides / chemistry
  • Chlorine / chemistry
  • Chloroform
  • Disinfectants* / analysis
  • Disinfection
  • Drinking Water*
  • Halogenation
  • Humans
  • Trihalomethanes / analysis
  • Water Pollutants, Chemical* / analysis
  • Water Purification*

Substances

  • Bromides
  • Drinking Water
  • chlorodibromomethane
  • bromoform
  • Chlorine
  • bromodichloromethane
  • Chloroform
  • Water Pollutants, Chemical
  • Disinfectants
  • Trihalomethanes