Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically Based Features

Molecules. 2021 Feb 26;26(5):1285. doi: 10.3390/molecules26051285.

Abstract

Substances that can modify the androgen receptor (AR) pathway in humans and animals are entering the environment and food chain. They have a proven ability to disrupt hormonal systems, leading to toxicity and to adverse effects that include impaired reproduction, altered brain development, and prostate cancer, among others. State-of-the-art databases of experimental data on the effects of chemicals in humans, chimps, and rats have been used to build machine-learning classifiers and regressors and to evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results: deep neural networks (DNNs) trained on user-defined, physicochemically relevant features developed for this work outperformed graph convolutional, random forest, and large-featurization approaches. These user-provided structure-, ligand-, and statistically based features, combined with specific DNNs, gave the best results as determined by AUC (0.87), MCC (0.47), and other metrics, and by the interpretability and chemical meaning of the descriptors/features. In addition, the same features performed better in the DNN method than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared with evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available on GitHub.
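
The abstract reports its results as AUC and MCC (Matthews correlation coefficient), two metrics that remain informative on unbalanced binder/non-binder sets. The following is a minimal sketch of how these two quantities are defined, not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of them come from the paper.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outscores a
    random negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical, unbalanced toy set: 2 binders (1) and 8 non-binders (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("MCC:", mcc(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

AUC is computed from the ranking of the scores alone, whereas MCC depends on the chosen threshold, which is one reason the two numbers can differ considerably for the same model.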

Keywords: androgen receptor; artificial intelligence; deep neural network; machine learning; random forest.

MeSH terms

  • Algorithms
  • Animals
  • Deep Learning*
  • Humans
  • Ligands
  • Logistic Models
  • Neural Networks, Computer
  • Protein Binding / genetics*
  • Proteins / genetics*
  • Rats
  • Receptors, Androgen / genetics*
  • Software

Substances

  • AR protein, human
  • Ligands
  • Proteins
  • Receptors, Androgen