An interpretable machine learning model for selectivity of small-molecules against homologous protein family

Future Med Chem. 2022 Oct;14(20):1441-1453. doi: 10.4155/fmc-2022-0075. Epub 2022 Sep 28.

Abstract

Aim: In the early stages of drug discovery, various experimental and computational methods are used to measure the specificity of small molecules against a target protein. Achieving selectivity remains a challenge, and poor selectivity leads to off-target side effects. Methods: We developed a multitask deep learning model for predicting the selectivity of small molecules across closely related homologs of the target protein. The model was tested on the Janus-activated kinase and dopamine receptor families of proteins. Results & conclusion: The feature-based representation (extended connectivity fingerprint 4) with Extreme Gradient Boosting outperformed the deep neural network models on most of the evaluation metrics. Both the Extreme Gradient Boosting and deep neural network models outperformed the graph-based models. Furthermore, to interpret the model's decisions on selectivity, the important fragments associated with each homologous protein were identified.
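As a rough illustration of how per-homolog outputs from a multitask model might be turned into a selectivity call, the sketch below compares a molecule's predicted activity on one homolog with its best predicted activity on the others. This is not the authors' method: the function names, the pIC50-like scores, the example values and the 1.0 log-unit cutoff (about 10-fold) are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical per-homolog predictions (pIC50-like scores) for one molecule,
# standing in for the outputs of a multitask model. All values are made up.
PredictedActivity = Dict[str, float]


def selectivity_window(pred: PredictedActivity, target: str) -> float:
    """Return the target score minus the best off-target score (log units)."""
    off_targets = [score for name, score in pred.items() if name != target]
    if target not in pred or not off_targets:
        raise ValueError("need the target and at least one other homolog")
    return pred[target] - max(off_targets)


def most_selective_target(
    pred: PredictedActivity, min_window: float = 1.0
) -> Optional[str]:
    """Return the homolog the molecule appears selective for, or None.

    min_window = 1.0 log unit (about 10-fold) is an assumed cutoff,
    not a value taken from the paper.
    """
    best = max(pred, key=pred.get)
    if selectivity_window(pred, best) >= min_window:
        return best
    return None


# Illustrative scores across the four Janus kinase family members.
example = {"JAK1": 6.1, "JAK2": 8.3, "JAK3": 6.9, "TYK2": 5.8}
window = selectivity_window(example, "JAK2")
call = most_selective_target(example)
```

Here `window` is the gap between the JAK2 score and the next-highest homolog, and `call` names the homolog only when that gap clears the assumed cutoff. A molecule with near-equal scores on two homologs would return `None`.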

Keywords: SHAP values; explainable models; machine learning; multitask models; selectivity.

MeSH terms

  • Drug Discovery / methods
  • Machine Learning*
  • Neural Networks, Computer*
  • Proteins
  • Receptors, Dopamine

Substances

  • Proteins
  • Receptors, Dopamine