Predicting stream water quality under different urban development pattern scenarios with an interpretable machine learning approach

Sci Total Environ. 2021 Mar 20:761:144057. doi: 10.1016/j.scitotenv.2020.144057. Epub 2020 Dec 14.

Abstract

Urban development patterns significantly affect stream water quality by influencing pollutant generation, build-up, and wash-off processes. Understanding and predicting stream water quality under different urban development patterns is therefore necessary to inform urban growth planning and policy. To this end, we collected pollutant concentration data on nitrate (NO3⁻-N), total phosphate (TP), and Escherichia coli (E. coli) from 1047 sampling stations in the Texas Gulf Region. We used a Random Forest (RF) machine learning model to predict stream water quality under four planning scenarios with different urban densities and configurations. SHapley Additive exPlanations (SHAP) was used to assess the importance of urban development pattern metrics in influencing stream water quality, and Geographically Weighted Regression (GWR) was used to explore the spatial variation in their impact. SHAP results indicated that Largest Patch Index (LPI), Patch Cohesion Index (COHESION), Splitting Index (SPLIT), and Landscape Division Index (DIVISION) were the most important urban development pattern metrics affecting stream water quality. The impact of these patterns on stream water quality varied spatially and depended on the pollutant, season, climate, and urbanization level. RF prediction results suggested that high-density aggregated development was more effective in reducing TP and NO3⁻-N concentrations than the current sprawl development, but carried a potential risk of increasing E. coli pollution in the wet season. The results of this study provide empirical evidence, and a potential mechanistic explanation, that stream water quality degradation is a consequence of urban sprawl. Lastly, machine learning is a powerful tool for scenario prediction in land use planning, allowing environmental impacts to be forecast under different urban development pattern scenarios.
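
To make three of the metrics named above concrete, the following is a minimal sketch of how LPI, DIVISION, and SPLIT could be computed for a single land-cover class from a list of patch areas. It assumes the standard FRAGSTATS-style class-level definitions; the abstract does not state how the authors computed the metrics. The function name `patch_metrics` and the two example landscapes are hypothetical illustrations, not data from the study. COHESION is omitted because it also requires patch perimeters.

```python
def patch_metrics(patch_areas, landscape_area):
    """Return LPI, DIVISION and SPLIT for one land-cover class.

    Assumed (FRAGSTATS-style) definitions, with a_i the area of patch i
    and A the total landscape area, both in the same units:
        LPI      = 100 * max(a_i) / A
        DIVISION = 1 - sum((a_i / A) ** 2)
        SPLIT    = A ** 2 / sum(a_i ** 2)
    """
    if landscape_area <= 0:
        raise ValueError("landscape_area must be positive")
    if not patch_areas or any(a <= 0 for a in patch_areas):
        raise ValueError("patch_areas must be a non-empty list of positive areas")
    if sum(patch_areas) > landscape_area:
        raise ValueError("patches cannot cover more than the landscape")

    sum_sq = sum(a * a for a in patch_areas)
    return {
        "LPI": 100.0 * max(patch_areas) / landscape_area,
        "DIVISION": 1.0 - sum_sq / landscape_area ** 2,
        "SPLIT": landscape_area ** 2 / sum_sq,
    }


# Hypothetical example: the same 40 units of urban land in a 100-unit
# landscape, arranged as one aggregated patch or as four scattered patches.
aggregated = patch_metrics([40.0], 100.0)
sprawl = patch_metrics([10.0, 10.0, 10.0, 10.0], 100.0)
```

Under these assumed definitions, the scattered arrangement yields a lower LPI and higher DIVISION and SPLIT than the aggregated one, even though the total urban area is identical. This is the kind of configuration difference the planning scenarios vary.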

Keywords: Landscape metrics; Machine learning; Scenario planning; Urban form; Urban sprawl; Water quality.