Machine Learning for Evaluating the Cytotoxicity of Mixtures of Nano-TiO2 and Heavy Metals: QSAR Model Apply Random Forest Algorithm after Clustering Analysis

Molecules. 2022 Sep 19;27(18):6125. doi: 10.3390/molecules27186125.

Abstract

As nanomaterials are developed and applied more widely, their impact on the environment and on organisms has attracted attention. Nano-titanium dioxide (nano-TiO2), a common nanomaterial, can adsorb heavy metals present in the environment. Quantitative structure-activity relationship (QSAR) modeling is often used to predict the cytotoxicity of single substances, but the toxicity arising from interactions between nanomaterials and other substances has received little study. In this study, we exposed human renal cortex proximal tubule epithelial (HK-2) cells to mixtures of eight heavy metals with nano-TiO2, measured absorbance with the CCK-8 assay, and calculated cell viability. Partial least squares (PLS) and two ensemble learning algorithms were used to build multiple QSAR models on the data sets; with the ensemble algorithms, the test-set R2 increased from 0.38 to 0.78 and 0.85, and the RMSE decreased from 0.18 to 0.12 and 0.10. Random forest, the better-performing algorithm, was then selected, and K-means clustering was applied to optimize the model further, increasing the test-set R2 to 0.95 and decreasing the RMSE to 0.08 and 0.06. As a reliable machine learning algorithm, random forest can be used to predict the toxicity of mixtures of nano-metal oxides and heavy metals. Cluster analysis can effectively improve the stability and predictive ability of the model and offers a new approach for future cytotoxicity prediction models.
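The workflow described in the abstract (viability from absorbance, K-means clustering, then a random forest scored by test-set R2 and RMSE) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic stand-in descriptors and responses, applies the conventional blank-corrected CCK-8 viability formula, and fits one forest per cluster. The abstract specifies none of these details, so the function names, cluster count, split ratio, and hyperparameters are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def viability_percent(a_sample, a_control, a_blank):
    """Conventional CCK-8 viability (assumed formula): blank-corrected
    absorbance of treated wells relative to untreated control wells, in %."""
    return (a_sample - a_blank) / (a_control - a_blank) * 100.0


def fit_rf_per_cluster(X, y, n_clusters=2, seed=0):
    """Group samples with K-means, then fit and score one random forest
    per cluster on a held-out test split (hypothetical procedure)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    results = {}
    for c in range(n_clusters):
        mask = km.labels_ == c
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[mask], y[mask], test_size=0.25, random_state=seed
        )
        rf = RandomForestRegressor(n_estimators=200, random_state=seed)
        rf.fit(X_tr, y_tr)
        pred = rf.predict(X_te)
        results[c] = {
            "n": int(mask.sum()),
            "r2": float(r2_score(y_te, pred)),
            "rmse": float(np.sqrt(mean_squared_error(y_te, pred))),
        }
    return km, results


# Synthetic stand-in data: two well-separated groups of 4 descriptors each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (80, 4)), rng.normal(6.0, 1.0, (80, 4))])
# Synthetic "viability fraction" response with a little noise.
y = 0.5 + 0.1 * np.tanh(X[:, 0] - X[:, 1]) + rng.normal(0.0, 0.01, 160)

km, results = fit_rf_per_cluster(X, y)
for c, m in results.items():
    print(f"cluster {c}: n={m['n']}, test R2={m['r2']:.2f}, RMSE={m['rmse']:.3f}")
```

Fitting a separate model per cluster is only one way to combine clustering with a regressor; cluster labels could equally be used to stratify the train/test split or be added as an extra feature.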

Keywords: AdaBoost; QSAR; RF; cluster analysis; cytotoxicity; mixture; quantum mechanics.

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Humans
  • Machine Learning
  • Metals, Heavy* / toxicity
  • Oxides
  • Quantitative Structure-Activity Relationship*
  • Sincalide
  • Titanium

Substances

  • Metals, Heavy
  • Oxides
  • titanium dioxide
  • Titanium
  • Sincalide