Development of GBRT Model as a Novel and Robust Mathematical Model to Predict and Optimize the Solubility of Decitabine as an Anti-Cancer Drug

Molecules. 2022 Sep 2;27(17):5676. doi: 10.3390/molecules27175676.

Abstract

The efficient production of solid-dosage oral formulations using eco-friendly supercritical solvents is regarded as a breakthrough technology for developing cost-effective therapeutic drugs. Drug solubility is a key parameter that must be measured before designing the process. Decitabine belongs to the antimetabolite class of chemotherapy agents and is used to treat patients with myelodysplastic syndrome (MDS). In recent years, predicting drug solubility with mathematical models based on artificial intelligence (AI) has attracted interest because of the high cost of experimental investigations. The purpose of this study is to develop several machine-learning-based models to estimate the optimum solubility of the anti-cancer drug decitabine and to evaluate the effects of pressure and temperature on it. To build models on a small dataset, we used three ensemble methods: Random Forest (RFR), Extra Tree (ETR), and Gradient Boosted Regression Trees (GBRT). Different configurations were tested, and optimal hyper-parameters were found. The final models were then assessed using standard metrics. RFR, ETR, and GBRT had R² scores of 0.925, 0.999, and 0.999, respectively, and MAPE error rates of 1.423 × 10⁻¹, 7.573 × 10⁻², and 7.119 × 10⁻², respectively. Based on these results, GBRT was selected as the primary model in this paper. Using this method, the optimal values were calculated as: P = 380.88 bar, T = 333.01 K, Y = 0.001073.
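
The two metrics reported in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions of the coefficient of determination (R²) and of MAPE expressed as a fraction rather than a percentage, which matches the magnitude of the reported values; the abstract does not state which form the authors used. The function names and the sample numbers are hypothetical placeholders, not data from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.07 = 7%)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT measurements from the paper.
    measured = [1.0, 2.0, 4.0]
    predicted = [1.1, 1.9, 4.2]
    print("R2   =", r2_score(measured, predicted))
    print("MAPE =", mape(measured, predicted))
```

Because MAPE divides by each measured value, it is undefined when a measured value is zero. Solubility values such as the reported optimum are small but nonzero, so this does not arise for data of that kind.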

Keywords: anti-cancer drug; artificial intelligence; optimization; simulation.

MeSH terms

  • Antineoplastic Agents* / pharmacology
  • Artificial Intelligence*
  • Decitabine
  • Humans
  • Models, Theoretical
  • Solubility

Substances

  • Antineoplastic Agents
  • Decitabine

Grants and funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R99), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The author would like to thank the Deanship of Scientific Research at Shaqra University for supporting this work. The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through Large Groups Project under grant number (RGP.2/50/43). The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4290565DSR61.