Multi-task machine learning models for simultaneous prediction of tissue-to-blood partition coefficients of chemicals in mammals

Environ Res. 2024 Jan 15:241:117603. doi: 10.1016/j.envres.2023.117603. Epub 2023 Nov 7.

Abstract

Tissue-to-blood partition coefficients (Ptb) are crucial for assessing the distribution of chemicals in organisms. Given the lack of experimental data and the laborious nature of experimental methods, there is an urgent need to develop efficient predictive models. Using two machine learning algorithms, i.e., random forest (RF) and artificial neural network (ANN), this study developed multi-task (MT) models that can simultaneously predict Ptb values for various mammalian tissues, including liver, muscle, brain, lung, and adipose. Single-task (ST) models using partial least squares regression, RF, and ANN algorithms were established for each endpoint for comparison. Overall, the MT models outperformed the ST models. The MT model using the ANN algorithm showed the highest prediction accuracy, with determination coefficients ranging from 0.704 to 0.886, root mean square errors between 0.223 and 0.410, and mean absolute errors ranging from 0.178 to 0.285 log units. Results showed that the lipophilicity and polarizability of molecules significantly influence their partition behavior in organisms. Applicability domains (ADs) of the models were characterized by weighted molecular similarity density and weighted inconsistency in molecular activities of structure-activity landscapes. When constrained by ADs, the models displayed enhanced predictive accuracy, making them valuable tools for the risk assessment and management of chemicals.
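The three accuracy statistics quoted above (determination coefficient, root mean square error, mean absolute error) can be computed per tissue as in the minimal sketch below. It is an illustration, not the authors' code: the abstract does not give the exact formulas, so the common definition R2 = 1 - SS_res/SS_tot is assumed, and the function name, the use of None for a tissue with no measurement, and all numeric values are hypothetical.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) over the pairs whose observed value is not None.

    Assumes R2 = 1 - SS_res / SS_tot; the abstract does not state its formula.
    """
    # Skip chemicals with no measurement for this tissue (marked None here).
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two observed values")
    mean_t = sum(t for t, _ in pairs) / n
    ss_res = sum((t - p) ** 2 for t, p in pairs)
    ss_tot = sum((t - mean_t) ** 2 for t, _ in pairs)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    rmse = sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in pairs) / n
    return r2, rmse, mae


# Hypothetical log Ptb values for four chemicals; not data from the study.
observed = {
    "liver": [0.10, 0.50, None, 0.90],
    "adipose": [1.20, None, 2.00, 1.60],
}
predicted = {
    "liver": [0.20, 0.40, 0.30, 0.80],
    "adipose": [1.00, 1.50, 2.10, 1.70],
}

# One (R2, RMSE, MAE) tuple per tissue endpoint.
per_tissue = {
    tissue: regression_metrics(observed[tissue], predicted[tissue])
    for tissue in observed
}
```

Reporting the statistics per endpoint, rather than pooled over all tissues, matches how the abstract presents them as ranges across tissues.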

Keywords: Applicability domain; Artificial neural network; Multi-task learning; Random forest; Risk assessment; Tissue-to-blood partition coefficient.

MeSH terms

  • Algorithms*
  • Animals
  • Liver
  • Machine Learning
  • Mammals
  • Neural Networks, Computer*