Study of Structure-active Relationship for Inhibitors of HIV-1 Integrase LEDGF/p75 Interaction by Machine Learning Methods

Mol Inform. 2017 Jul;36(7). doi: 10.1002/minf.201600127. Epub 2017 Feb 28.

Abstract

HIV-1 integrase (IN) is a promising target for anti-AIDS therapy, and LEDGF/p75 has been shown to enhance the strand transfer activity of HIV-1 integrase in vitro. Blocking the interaction between IN and LEDGF/p75 is an effective way to inhibit HIV replication. In this work, 274 LEDGF/p75-IN inhibitors were collected as the dataset. Support Vector Machine (SVM), Decision Tree (DT), Function Tree (FT) and Random Forest (RF) were applied to build several computational models for predicting whether a compound is an active or weakly active LEDGF/p75-IN inhibitor. Each compound was represented by MACCS fingerprints and CORINA Symphony descriptors. The prediction accuracies of all models on the test sets are over 70 %. The best model, Model 3B, built by FT, achieved a prediction accuracy of 81.08 % and a Matthews Correlation Coefficient (MCC) of 0.62 on the test set. We found that hydrogen bond and hydrophobic interactions are important for the bioactivity of an inhibitor.
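
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how prediction accuracy and MCC are computed from the confusion matrix of a binary (active vs. weakly active) classifier. The function name accuracy_and_mcc and the counts used below are hypothetical and chosen only for illustration; they are not the confusion matrix of Model 3B, which the abstract does not report.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp/tn: correctly predicted active / weakly active compounds
    fp/fn: wrongly predicted active / weakly active compounds
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts, NOT taken from the paper.
acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
print(f"accuracy = {acc:.2%}, MCC = {mcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, so it remains informative when the two classes are of unequal size.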

Keywords: Classification model; Extended connectivity fingerprints (ECFP_4); HIV-1 integrase (IN) LEDGF/p75 inhibitor; Machine learning method.

MeSH terms

  • Computer Simulation
  • HIV Integrase / chemistry*
  • HIV Integrase / metabolism
  • HIV Integrase Inhibitors / chemistry*
  • Humans
  • Intercellular Signaling Peptides and Proteins / chemistry*
  • Intercellular Signaling Peptides and Proteins / metabolism
  • Machine Learning*
  • Models, Molecular
  • Molecular Conformation
  • Molecular Structure
  • Protein Binding
  • ROC Curve
  • Reproducibility of Results
  • Structure-Activity Relationship

Substances

  • HIV Integrase Inhibitors
  • Intercellular Signaling Peptides and Proteins
  • lens epithelium-derived growth factor
  • HIV Integrase
  • p31 integrase protein, Human immunodeficiency virus 1