Applying Machine Learning to investigate metal isotope variations at the watershed scale: A case study with lithium isotopes across the Yukon River Basin

Sci Total Environ. 2023 Oct 20:896:165165. doi: 10.1016/j.scitotenv.2023.165165. Epub 2023 Jun 30.

Abstract

Constraining the multiple climatic, lithological, topographic, and geochemical variables controlling isotope variations in large rivers is often challenging with standard statistical methods. Machine learning (ML) is an efficient method for analyzing multidimensional datasets, resolving correlated processes, and exploring relationships between variables simultaneously. We tested four ML algorithms to elucidate the controls of riverine δ7Li variations across the Yukon River Basin (YRB). We compiled existing samples (n = 102) and analyzed new samples (n = 21), producing a δ7Li dataset of 123 river water samples collected across the basin during the summer, and extracted environmental, climatological, and geological characteristics of the drainage area for each sample from open-access geospatial databases. The ML models were trained, tuned, and tested under multiple scenarios to avoid issues such as overfitting. Random Forests (RF) performed best at predicting δ7Li across the basin, with the median model explaining 62% of the variance. The most important variables controlling δ7Li across the basin are elevation, lithology, and past glacial coverage, which ultimately influence weathering congruence. Riverine δ7Li has a negative dependence on elevation, reflecting congruent weathering in kinetically limited mountain zones with short residence times. The consistent ranking of lithology, specifically igneous and metamorphic rock cover, as a top feature controlling riverine δ7Li in the RF models is unexpected, and further study is required to validate this finding. Rivers draining areas that were extensively ice-covered during the Last Glacial Maximum tend to have lower δ7Li because immature weathering profiles result in short residence times, less secondary mineral formation, and therefore more congruent weathering. We demonstrate that ML provides a fast, simple, visualizable, and interpretable approach for disentangling key controls of isotope variations in river water. We assert that ML should become a routine tool, and present a framework for applying ML to analyze spatial metal isotope data at the catchment scale.

Keywords: Lithium isotopes; Machine learning; Partial dependence plots; Permutation feature importance; Random forests; Yukon River.