Use of a large dataset to develop new models for estimating the sorption of active pharmaceutical ingredients in soils and sediments

J Hazard Mater. 2021 Aug 5:415:125688. doi: 10.1016/j.jhazmat.2021.125688. Epub 2021 Mar 24.

Abstract

Information on the sorption of active pharmaceutical ingredients (APIs) in soils and sediments is needed for assessing the environmental risks of these substances, yet these data are unavailable for many APIs in use. Predictive models for estimating sorption could provide a solution. The performance of existing models is, however, often poor, and most models do not account for the effects of soil/sediment properties, which are known to significantly affect API sorption. Here, we therefore use a high-quality dataset on the sorption behavior of 54 APIs in 13 soils and sediments to develop new models for estimating sorption coefficients for APIs in soils and sediments using three machine learning approaches (artificial neural network, random forest and support vector machine) and linear regression. A random forest-based model, with chemical and solid descriptors as the input, was the best-performing model. Evaluation of this model using an independent sorption dataset from the literature showed that it predicted sorption coefficients for 90% of the test set to within a factor of 10 of the experimental values. This new model could be invaluable in assessing the sorption behavior of molecules that have yet to be tested and in landscape-level risk assessments.
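The "within a factor of 10" criterion used in the evaluation can be illustrated with a minimal sketch. It assumes the criterion means that the predicted and experimental sorption coefficients differ by no more than one log10 unit; the abstract does not spell out the exact computation. The function name `fraction_within_factor` and all numeric values below are hypothetical and are not taken from the study.

```python
import math


def fraction_within_factor(predicted, measured, factor=10.0):
    """Return the share of predictions within a multiplicative `factor`
    of the corresponding measured values (assumed reading of the criterion)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    limit = math.log10(factor)
    hits = 0
    for p, m in zip(predicted, measured):
        if p <= 0 or m <= 0:
            raise ValueError("sorption coefficients must be positive")
        # Compare on the log scale; the small tolerance keeps values that
        # sit exactly on the boundary from failing due to rounding.
        if abs(math.log10(p) - math.log10(m)) <= limit + 1e-12:
            hits += 1
    return hits / len(predicted)


# Hypothetical sorption coefficients, for illustration only.
measured_kd = [1.2, 15.0, 300.0, 0.8, 45.0]
predicted_kd = [2.0, 90.0, 20.0, 0.5, 40.0]

share = fraction_within_factor(predicted_kd, measured_kd)
print(f"{share:.0%} of predictions fall within a factor of 10")
```

In this made-up example, four of the five predictions fall within a factor of 10 of the measured value; the third (20 against 300) is off by a factor of 15 and does not. Working on the log scale treats over- and under-prediction symmetrically, which suits sorption coefficients that can span several orders of magnitude.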

Keywords: Active pharmaceutical ingredients; Environmental fate; Environmental risk assessment; Machine learning; Molecular and solid properties.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adsorption
  • Geologic Sediments
  • Pharmaceutical Preparations*
  • Soil
  • Soil Pollutants* / analysis

Substances

  • Pharmaceutical Preparations
  • Soil
  • Soil Pollutants