Background: Abnormal activation of human nuclear hormone receptors disrupts endocrine systems and thereby affects human health. Machine learning-based models have been developed to predict androgen receptor agonist activity. However, these models were built on a limited set of numerical features, such as molecular descriptors and fingerprints.
Result: In this study, 2-D chemical structure images of compounds were used instead of numerical features to build an androgen receptor toxicity prediction model. The images may provide features that are not represented by conventional numerical features. This strategy yielded a highly accurate prediction model: Matthews correlation coefficient (MCC) of 0.688, positive predictive value (PPV) of 0.933, sensitivity of 0.519, specificity of 0.998, and overall accuracy of 0.981 in 10-fold cross-validation. Validation on a test dataset showed MCC of 0.370, sensitivity of 0.211, specificity of 0.991, PPV of 0.882, and overall accuracy of 0.801. Our chemical image-based prediction model outperforms conventional models based on numerical features.
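For readers less familiar with the reported metrics, the following minimal sketch shows how MCC, PPV, sensitivity, specificity, and overall accuracy are derived from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and chosen only for illustration; they are not the counts from this study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return standard binary classification metrics from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    """
    # Each ratio falls back to 0.0 when its denominator is zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "mcc": mcc,
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Illustrative counts only (assumed, not taken from the study).
example = binary_metrics(tp=50, fp=5, tn=900, fn=45)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, where inactive compounds dominate, accuracy and specificity stay high even though only about half of the actives are recovered. This is why MCC and sensitivity are worth reading alongside overall accuracy.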
Conclusion: Our prediction model successfully classified molecular images as androgen receptor agonists or inactive compounds. The result indicates that 2-D molecular mimetic diagrams could be used as another feature for constructing molecular activity prediction models.
Keywords: Androgen receptor toxicity; Chemical compound images; Convolutional neural network.