Predicting the risk of GenX contamination in private well water using a machine-learned Bayesian network model

J Hazard Mater. 2021 Jun 5:411:125075. doi: 10.1016/j.jhazmat.2021.125075. Epub 2021 Jan 7.

Abstract

Per- and polyfluoroalkyl substances (PFAS) are emerging contaminants that pose significant challenges in mechanistic fate and transport modeling due to their diverse and complex chemical characteristics. Machine learning provides a novel approach for predicting the spatial distribution of PFAS in the environment. We used spatial location information to link PFAS measurements from 1207 private drinking water wells around a fluorochemical manufacturing facility to a mechanistic model of PFAS air deposition and to publicly available data on soil, land use, topography, weather, and proximity to multiple PFAS sources. We used the resulting linked data set to train a Bayesian network model to predict the risk that GenX, a member of the PFAS class, would exceed a state provisional health goal (140 ng/L) in private well water. The model had high accuracy (ROC curve index for five-fold cross-validation of 0.85, 90% CI 0.84-0.87). Among factors significantly associated with GenX risk in private wells, the most important was the historic rate of atmospheric deposition of GenX from the fluorochemical manufacturing facility. The model output was used to generate spatial risk predictions for the study area to aid in risk assessment, environmental investigations, and targeted public health interventions.

Keywords: Bayesian network; Drinking water; GenX; Machine-learning; PFAS; Well water.
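
The evaluation described in the abstract (a binary target defined by the 140 ng/L health goal, scored by the area under the ROC curve over five-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' model or data. It assumes a deliberately tiny two-node network (one discretized predictor, here called a deposition bin, pointing to the exceedance node), synthetic well measurements, and hypothetical function names throughout.

```python
import random

HEALTH_GOAL_NG_L = 140.0  # state provisional health goal cited in the abstract


def exceeds_goal(genx_ng_l, goal=HEALTH_GOAL_NG_L):
    """Binary target: does the measured GenX concentration exceed the goal?"""
    return genx_ng_l > goal


def roc_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_bin_rates(bins, labels, n_bins):
    """Laplace-smoothed conditional probability table P(exceed | bin)."""
    pos = [1.0] * n_bins
    tot = [2.0] * n_bins
    for b, y in zip(bins, labels):
        tot[b] += 1
        pos[b] += 1 if y else 0
    return [p / t for p, t in zip(pos, tot)]


def cross_validated_auc(bins, labels, n_bins, k=5):
    """Score every well with a table fitted without its fold, then compute one pooled AUC."""
    scores = [0.0] * len(labels)
    for held_out in k_fold_indices(len(labels), k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        rates = fit_bin_rates([bins[i] for i in train], [labels[i] for i in train], n_bins)
        for i in held_out:
            scores[i] = rates[bins[i]]
    return roc_auc(labels, scores)


# Synthetic illustration only: higher deposition bins yield higher concentrations.
rng = random.Random(42)
bins = [rng.randrange(4) for _ in range(200)]
labels = [exceeds_goal(rng.lognormvariate(3.0 + b, 1.0)) for b in bins]
auc = cross_validated_auc(bins, labels, n_bins=4)
print(f"five-fold cross-validated AUC on synthetic data: {auc:.2f}")
```

Pooling the out-of-fold scores before computing a single AUC is one common convention; averaging per-fold AUCs is another, and the abstract does not say which was used.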