ADME prediction with KNIME: A retrospective contribution to the second "Solubility Challenge"

Gabriela Falcón-Cano; Christophe Molina; Miguel Ángel Cabrera-Pérez

doi:10.5599/admet.979

ADMET DMPK. 2021 Jul 12;9(3):209-218. doi: 10.5599/admet.979. eCollection 2021.

Authors

Gabriela Falcón-Cano¹, Christophe Molina², Miguel Ángel Cabrera-Pérez¹,³,⁴

Affiliations

¹ Unit of Modelling and Experimental Biopharmaceutics. Centro de Bioactivos Químicos. Universidad Central "Marta Abreu" de las Villas. Santa Clara 54830, Villa Clara, Cuba.
² PIKAÏROS S.A., 31650 Saint Orens de Gameville, France.
³ Department of Pharmacy and Pharmaceutical Technology, University of Valencia, Burjassot 46100, Valencia, Spain.
⁴ Department of Engineering, Area of Pharmacy and Pharmaceutical Technology, Miguel Hernández University, 03550 Sant Joan d'Alacant, Alicante, Spain.

Abstract

Computational models that predict aqueous solubility from molecular structure are a promising strategy for drug design and discovery. Since the first "Solubility Challenge", such initiatives have marked the state of the art of the modelling algorithms used to predict drug solubility. In this regard, the quality of the input experimental data and its influence on model performance have been frequently discussed. In our previous study, we developed a computational model for aqueous solubility based on recursive random forest approaches. The aim of the current commentary is to analyse the performance of this already trained predictive model on the molecules of the second "Solubility Challenge". Even though our training set contains inconsistencies related to the pH, solid form and temperature conditions of the solubility measurements, the model was able to predict the two sets from the second "Solubility Challenge" with statistics comparable to those of the top-ranked models. Finally, we provide a KNIME automated workflow to predict the aqueous solubility of new drug candidates during the early stages of drug discovery and development, ensuring the applicability and reproducibility of our model.

Keywords: ADME; KNIME; Quantitative Structure-Property Relationship (QSPR); Random Forest; Second Solubility Challenge; aqueous solubility; machine learning; supervised recursive variable selection.
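
The abstract compares the model with the top-ranked entries through summary statistics but does not name them here. As a minimal sketch, assuming the comparison uses the root-mean-square error, the coefficient of determination and the fraction of compounds predicted within ±0.5 log units (all three are assumptions, as are the function name and the example values, which are hypothetical and not taken from the study), such statistics could be computed from observed and predicted log-solubility values as follows:

```python
import math


def solubility_stats(observed, predicted, tolerance=0.5):
    """Summarise agreement between observed and predicted log-solubility values.

    Hypothetical helper: the choice of metrics and the 0.5 log-unit
    tolerance are assumptions, not details stated in the abstract.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]

    # Residual sum of squares and root-mean-square error.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination relative to the mean of the observed values.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Fraction of compounds whose absolute error is within the tolerance.
    within = sum(1 for r in residuals if abs(r) <= tolerance) / n

    return {"rmse": rmse, "r2": r2, "within_tolerance": within}


# Made-up values for illustration only (not data from the Solubility Challenge).
observed_log_s = [-2.0, -3.5, -4.0, -5.5]
predicted_log_s = [-2.3, -3.0, -4.9, -5.4]
stats = solubility_stats(observed_log_s, predicted_log_s)
```

Reporting the fraction within a fixed tolerance alongside RMSE and R² shows whether a reasonable average error hides a few large outliers, which matters when the reference measurements themselves vary with pH, solid form and temperature.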