A novel proteochemometrics model for predicting the inhibition of nine carbonic anhydrase isoforms based on supervised Laplacian score and k-nearest neighbour regression

SAR QSAR Environ Res. 2018 Jun;29(6):419-437. doi: 10.1080/1062936X.2018.1447995. Epub 2018 Jun 8.

Abstract

Carbonic anhydrases (CAs) are essential enzymes in biological processes. Computational techniques can predict the activity of compounds towards CA isoforms and thereby support the discovery of novel therapeutic inhibitors. Approaches such as quantitative structure-activity relationships (QSARs), molecular docking and pharmacophore modelling have been used to design potent inhibitors. However, QSAR models do not incorporate information about the target space. We developed an in silico proteochemometrics model that uses target and ligand descriptors simultaneously to predict the activities of CA inhibitors. The model predicts protein-ligand binding affinity between nine human CA isoforms and 549 ligands. Proteins were encoded by descriptors obtained from the PROFEAT webserver, and ligands by descriptors from the PaDEL-Descriptor software. Supervised Laplacian score (SLS) and particle swarm optimization were used for feature selection. Models were derived using k-nearest neighbour (KNN) regression and a kernel smoother model. The predictive ability of the models was evaluated on an external validation set. The statistical results (Q²_ext = 0.7806, r²_test = 0.7811 and RMSE_test = 0.5549) showed that the model generated using SLS and KNN regression outperformed the other models. Consequently, the selectivity of compounds towards these enzymes can be predicted prior to synthesis.
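The core idea of the abstract, describing each protein-ligand pair by joined ligand and target descriptors and predicting activity with KNN regression, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the descriptor values, the names `lig_a` and `iso_1`, the activity values, the choice of k, the unweighted Euclidean distance, and the Q²_ext definition (one common variant that uses the training-set mean). The feature selection step (SLS, particle swarm optimization) is omitted, and this is not the authors' implementation.

```python
import math

def concat_descriptors(ligand_desc, target_desc):
    """Join ligand and target descriptor vectors into one pair vector."""
    return list(ligand_desc) + list(target_desc)

def knn_predict(train_X, train_y, query, k=3):
    """Average the activities of the k nearest training pairs (Euclidean)."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

def external_metrics(y_true, y_pred, train_mean):
    """Return (Q2_ext, r2, RMSE) for an external test set.

    Assumed definitions: Q2_ext = 1 - SS_res / sum((y - train_mean)^2);
    r2 is the squared Pearson correlation between observed and predicted.
    """
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_ext = sum((t - train_mean) ** 2 for t in y_true)
    q2_ext = 1 - ss_res / ss_ext
    rmse = math.sqrt(ss_res / n)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov ** 2 / (var_t * var_p)
    return q2_ext, r2, rmse

# Hypothetical toy descriptors (not PaDEL or PROFEAT output).
ligands = {"lig_a": [0.0, 1.0], "lig_b": [1.0, 0.0],
           "lig_c": [1.0, 1.0], "lig_d": [0.0, 0.0]}
targets = {"iso_1": [0.0], "iso_2": [1.0]}

# Hypothetical (ligand, isoform, activity) records, invented for the demo.
train = [("lig_a", "iso_1", 5.0), ("lig_a", "iso_2", 6.0),
         ("lig_b", "iso_1", 7.0), ("lig_b", "iso_2", 8.0),
         ("lig_c", "iso_1", 6.5)]
test = [("lig_c", "iso_2", 7.5), ("lig_d", "iso_1", 6.0)]

train_X = [concat_descriptors(ligands[l], targets[t]) for l, t, _ in train]
train_y = [y for _, _, y in train]
test_X = [concat_descriptors(ligands[l], targets[t]) for l, t, _ in test]
test_y = [y for _, _, y in test]

preds = [knn_predict(train_X, train_y, x, k=2) for x in test_X]
q2_ext, r2, rmse = external_metrics(test_y, preds, sum(train_y) / len(train_y))
print(f"Q2_ext={q2_ext:.3f}  r2={r2:.3f}  RMSE={rmse:.3f}")
```

Because the isoform descriptors are part of every input vector, a single model can give different predictions for the same ligand against different isoforms, which is what makes selectivity prediction possible in this setting. In practice, descriptors would normally be scaled before distances are computed.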

Keywords: Carbonic anhydrase isoforms; cancer; k-nearest neighbour; proteochemometrics; regression.

MeSH terms

  • Carbonic Anhydrases / chemistry*
  • Humans
  • Isoenzymes / chemistry
  • Ligands
  • Models, Chemical*
  • Models, Molecular
  • Structure-Activity Relationship

Substances

  • Isoenzymes
  • Ligands
  • Carbonic Anhydrases