Predicting Hydrocarbon Primary Biodegradation in Soil and Sediment Systems Using System Parameterization and Machine Learning

Environ Toxicol Chem. 2024 Mar 28. doi: 10.1002/etc.5857. Online ahead of print.

Abstract

Technical complexity associated with biodegradation testing, particularly for substances of unknown or variable composition, complex reaction products, or biological materials (UVCB), necessitates the advancement of non-testing methods such as quantitative structure-property relationships (QSPRs). Models for describing the biodegradation of petroleum hydrocarbons (HCs) have been previously developed. A critical limitation of available models is their inability to capture the variability in biodegradation rates associated with variable test systems and environmental conditions. Recently, the Hydrocarbon Biodegradation System Integrated Model (HC-BioSIM) was developed to characterize the biodegradation of HCs in aquatic systems with the inclusion of key test system variables. The present study further expands the HC-BioSIM methodology to soil and sediment systems using a database of 2195 half-life (i.e., degradation time [DT]50) entries for HCs in soil and sediment. Relevance and reliability criteria were defined based on similarity to standard testing guidelines for biodegradation testing and applied to all entries in the database. The HC-BioSIM soil and sediment models significantly outperformed the existing biodegradation HC half-life (BioHCWin) and virtual evaluation of chemical properties and toxicities (VEGA) quantitative Mario Negri Institute for Pharmacological Research (IRFMN) models in soil and sediment. Average errors in predicted DT50s were reduced by up to 6.3- and 8.7-fold for soil and sediment, respectively. No significant bias as a function of HC class, carbon number, or test system parameters was observed. Model diagnostics demonstrated low variability in performance and high consistency of parameter usage/importance and rule structure, supporting the generalizability and stability of the models for application to external data sets. 
The HC-BioSIM provides improved accuracy of Persistence categorization, with correct classification rates of 83.9% and 90.6% for soil and sediment, respectively, demonstrating a significant improvement over the existing BioHCWin (70.7% and 58.6%) and VEGA (59.5% and 18.5%) models. Environ Toxicol Chem 2024;00:1-12. © 2024 Concawe. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC.

Keywords: Biodegradation; Hazard/risk assessment; Quantitative structure–activity relationships.
