Large Dataset-Based Regression Model of Chemical Toxicity to Vibrio fischeri

Arch Environ Contam Toxicol. 2023 Jul;85(1):46-54. doi: 10.1007/s00244-023-01010-4. Epub 2023 Jul 5.

Abstract

For the first time, a global regression quantitative structure-toxicity/activity relationship (QSTR/QSAR) model was developed for the toxicity of a large dataset of 1236 chemicals towards Vibrio fischeri, using the random forest (RF) regression algorithm. The optimal RF model, with parameters mtry = 3, ntree = 150 and nodesize = 5, was based on 13 molecular descriptors. It accurately predicted the toxicity of 99.1% of the 1236 chemicals and yielded coefficients of determination (R²) of 0.893 for the 930 log(Mw/IBC50) values in the training set, 0.723 for the 306 values in the test set, and 0.865 for all 1236 values in the total set. The optimal global RF model proposed in this work is comparable to published local QSTR models built on small datasets of toxicity to Vibrio fischeri.
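The coefficients of determination quoted above can be illustrated with a minimal sketch. It assumes the common definition R² = 1 - SS_res/SS_tot; the abstract does not state which definition the authors used. The function name `r_squared` and the observed and predicted values are hypothetical and are not data from the study; only the set sizes (930, 306, 1236) come from the abstract.

```python
from typing import Sequence

# Set sizes reported in the abstract.
TRAIN_N = 930
TEST_N = 306
TOTAL_N = TRAIN_N + TEST_N  # 1236 chemicals in total


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: spread of the prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical log(Mw/IBC50) values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
print(f"R^2 = {score:.3f}")
```

Under this definition, a lower R² on the test set (0.723) than on the training set (0.893) means the model leaves a larger share of the variance unexplained for chemicals it was not fitted to.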

MeSH terms

  • Aliivibrio fischeri*
  • Quantitative Structure-Activity Relationship*
  • Random Forest