Prediction of attachment efficiency using machine learning on a comprehensive database and its validation

Water Res. 2023 Feb 1:229:119429. doi: 10.1016/j.watres.2022.119429. Epub 2022 Nov 25.

Abstract

Colloidal particles can attach to surfaces during transport, but the attachment depends on particle size, hydrodynamics, solid and water chemistry, and particulate matter. In filtration theory, attachment is quantified by the attachment (or sticking) efficiency, Alpha. A comprehensive Alpha database (2538 records) was built from experiments in the literature and used to develop a machine learning (ML) model to predict Alpha. Training (r-squared: 0.86) was performed using two random forests capable of handling missing data. A holdout dataset was used to validate the trained model (r-squared: 0.98), and variable importance was explored for both training and validation. Finally, an additional validation dataset was built from quartz crystal microbalance experiments using surface-modified polystyrene, poly(methyl methacrylate), and polyethylene, performed in the absence or presence of humic acid. A regression on the full database (r-squared: 0.90) predicted Alpha for the additional validation dataset with an r-squared of 0.23. However, when the original database and the additional validation dataset were combined into a new database, the r-squared values for both training (0.95) and validation (0.70) increased. The developed ML model provides a data-driven prediction of Alpha over a large database and evaluates the significance of 22 input variables.
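
The r-squared values quoted in the abstract compare predicted and observed Alpha on training and holdout data. The abstract does not state which definition of r-squared was used or how large the holdout set was, so the following is only a minimal sketch under stated assumptions: r-squared is taken as the coefficient of determination (1 - SS_res / SS_tot), and the 20% holdout fraction is a hypothetical value chosen for illustration. The function names `holdout_split` and `r_squared` are likewise illustrative and do not come from the paper.

```python
import random


def holdout_split(records, holdout_fraction=0.2, seed=0):
    """Shuffle records and return (training, holdout) lists.

    The default holdout_fraction of 0.2 is an assumption; the abstract
    does not report the split that was actually used.
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition; a squared Pearson correlation would be an
    alternative reading of "r-squared".
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; r-squared undefined")
    return 1.0 - ss_res / ss_tot


# Usage: split placeholder record indices sized like the Alpha database.
training, holdout = holdout_split(range(2538))
```

Under this definition, a model that always predicts the mean observed Alpha scores 0 and a perfect model scores 1, so the reported 0.23 on the independent quartz crystal microbalance dataset would indicate predictions only modestly better than the mean.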

Keywords: Attachment efficiency; Colloid deposition; Machine learning; Missing data.

MeSH terms

  • Databases, Factual
  • Machine Learning*
  • Particle Size
  • Particulate Matter*
  • Quartz Crystal Microbalance Techniques

Substances

  • Particulate Matter