Algal community structure prediction by machine learning

Environ Sci Ecotechnol. 2022 Dec 30;14:100233. doi: 10.1016/j.ese.2022.100233. eCollection 2023 Apr.

Abstract

Algal community structure is vital for aquatic management. However, the complex environmental and biological processes involved make it challenging to model. To address this difficulty, we investigated the use of random forests (RF) to predict phytoplankton community shifts from multi-source environmental factors (including physicochemical, hydrological, and meteorological variables). The RF models robustly predicted algal communities composed of 13 major classes (Bray-Curtis dissimilarity = 9.2 ± 7.0%, validation NRMSE mostly <10%) and accurately simulated the total biomass (validation R² > 0.74) in Norway's largest lake, Lake Mjøsa. The importance analysis showed that the hydro-meteorological variables (Standardized MSE and Node Purity mostly >0.5) were the most influential factors regulating the phytoplankton. Furthermore, an in-depth ecological interpretation uncovered the interactive stress-response effects on the algal community that the RF models had learned. The interpretation showed that the environmental drivers (i.e., temperature, lake inflow, and nutrients) can jointly exert a strong influence on algal community shifts. This study highlights the power of machine learning in predicting complex algal community structures and provides insights into model interpretability.
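The two accuracy measures quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The function names `bray_curtis` and `nrmse` and the sample numbers are hypothetical. The abstract does not say how NRMSE is normalized, so normalizing by the range of the observed values is an assumption made here.

```python
import math


def bray_curtis(observed, predicted):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors.

    Computed as sum(|o - p|) / sum(o + p); 0 means identical composition,
    1 means no shared abundance.
    """
    if len(observed) != len(predicted):
        raise ValueError("vectors must have the same length")
    total = sum(observed) + sum(predicted)
    if total == 0:
        return 0.0  # two empty communities are treated as identical
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / total


def nrmse(observed, predicted):
    """Root-mean-square error divided by the observed range.

    ASSUMPTION: range normalization; the abstract does not state which
    normalization the study used.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return math.sqrt(mse) / span


# Hypothetical biomass of three algal classes in one sample.
obs_sample = [6.0, 3.0, 1.0]
pred_sample = [5.0, 4.0, 1.0]
print(bray_curtis(obs_sample, pred_sample))

# Hypothetical time series of one class across five sampling dates.
obs_series = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_series = [1.0, 2.0, 3.0, 4.0, 7.0]
print(nrmse(obs_series, pred_series))
```

Here Bray-Curtis compares the predicted and observed composition of a single sample across classes, whereas NRMSE summarizes the error for one class across samples. That matches the abstract's reporting of one community-level value and per-class validation errors.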

Keywords: Environmental driver; Hydrology; Meteorology; Model interpretability; Phytoplankton community; Random forests.