Comparison of random forest and multiple linear regression to model the mass balance of biosolids from a complex biosolids management area

Water Environ Res. 2022 Jan;94(1):e1668. doi: 10.1002/wer.1668. Epub 2021 Dec 15.

Abstract

The use of biosolids as a soil amendment provides an important alternative to disposal and can improve soil health; however, distribution for water resource recovery facilities (WRRFs) in the United States can be challenging due to decreasing cropland, increased precipitation, variable plant operations, and financial constraints. Although statistical modeling is commonly used in the water sector, machine learning is still an emerging tool and can provide insights to optimize operations. Random forest (RF), a machine learning model, and multiple linear regression (MLR) were used in this study to model the mass balance of biosolids from a complex biosolids management area. The RF model (R² = 0.89) outperformed the MLR model (R² = 0.49) and showed that rainfall was a major factor impacting distribution. Storage for dried biosolids would help decouple drying operations from wet weather and increase distribution. This study demonstrated how machine learning can assist in decision-making processes for long-term planning at WRRFs.

Practitioner points

  • Random forest predicted the 7-day average mass balance of biosolids from a complex biosolids management area.
  • Decoupling biosolids drying operations from wet weather was identified as the highest operational priority.
  • Machine learning outperformed multiple linear regression and can be an important tool for the water sector.
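The two quantities the abstract reports, a 7-day average of the modeled variable and the coefficient of determination (R²) used to compare the models, can be illustrated with a minimal Python sketch. The function names `rolling_mean` and `r_squared` and the numbers in the usage lines are hypothetical and are not taken from the study; the sketch assumes a trailing (not centered) 7-day window, which the abstract does not specify.

```python
from statistics import mean


def rolling_mean(values, window=7):
    """Trailing moving average: one value per full window of `window` days.

    Assumption: a trailing window; the study may have averaged differently.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def r_squared(observed, predicted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    # Residual sum of squares: error left after applying the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical daily mass-balance values (arbitrary units), for illustration only.
daily = [12.0, 9.5, 11.0, 3.0, 2.5, 10.0, 13.0, 12.5, 4.0]
weekly_avg = rolling_mean(daily, window=7)

# Hypothetical model output for the same 7-day averages.
model_output = [v * 0.95 for v in weekly_avg]
score = r_squared(weekly_avg, model_output)
```

Under this definition, a model that always predicts the mean of the observations scores R² = 0 and a perfect model scores R² = 1, so the reported values of 0.89 and 0.49 indicate that the RF model explained considerably more of the variance than the MLR model did.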

Keywords: biosolids; machine learning; multiple linear regression; planning; random forest.

MeSH terms

  • Biosolids
  • Linear Models
  • Machine Learning
  • Soil*
  • Water Resources*

Substances

  • Biosolids
  • Soil