An improved method for predicting interactions between virus and human proteins

J Bioinform Comput Biol. 2017 Feb;15(1):1650024. doi: 10.1142/S0219720016500244. Epub 2016 Jul 10.

Abstract

The interaction of virus proteins with host proteins plays a key role in viral infection and consequent pathogenesis. Many computational methods have been proposed to predict protein-protein interactions (PPIs), but most are intended for PPIs within a species rather than PPIs across different species, such as virus-host PPIs. We developed a method that encodes key features of virus and human proteins of variable length as a feature vector of fixed length. The key features include the relative frequency of amino acid triplets (RFAT), the frequency difference of amino acid triplets (FDAT) between virus and host proteins, and amino acid composition (AC). We constructed several support vector machine (SVM) models to evaluate our method and to compare it with others on PPIs between human proteins and those of two types of viruses: human papillomaviruses (HPV) and hepatitis C virus (HCV). Comparison with other methods on the same datasets of HPV-human PPIs and HCV-human PPIs showed that our method performs significantly better on all performance measures. Using the SVM model with gene ontology (GO) annotations of proteins, we predicted new HPV-human PPIs. We believe our approach will be useful in predicting heterogeneous PPIs.
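
The fixed-length encoding described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions of RFAT, FDAT, or AC, so everything here is an assumption made for illustration: triplets are counted over the 20 standard amino acids with a sliding window, frequencies are normalized by the number of valid triplets, and FDAT is taken as the element-wise difference between the virus and human triplet frequencies. The function names (`triplet_frequency`, `amino_acid_composition`, `encode_pair`) are hypothetical and do not come from the paper.

```python
from itertools import product

# Assumption: the 20 standard residues; the paper may group residues differently.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TRIPLETS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]  # 20^3 = 8000
TRIPLET_INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (a stand-in for AC)."""
    counts = [0] * len(AMINO_ACIDS)
    valid = 0
    for ch in seq:
        idx = AMINO_ACIDS.find(ch)
        if idx >= 0:  # skip non-standard symbols such as X
            counts[idx] += 1
            valid += 1
    return [c / valid if valid else 0.0 for c in counts]


def triplet_frequency(seq):
    """Relative frequency of each amino acid triplet (a stand-in for RFAT)."""
    counts = [0] * len(TRIPLETS)
    total = 0
    for i in range(len(seq) - 2):  # sliding window of width 3
        idx = TRIPLET_INDEX.get(seq[i:i + 3])
        if idx is not None:
            counts[idx] += 1
            total += 1
    return [c / total if total else 0.0 for c in counts]


def encode_pair(virus_seq, human_seq):
    """Map a (virus, human) protein pair of any lengths to one fixed-length vector."""
    v_tri = triplet_frequency(virus_seq)
    h_tri = triplet_frequency(human_seq)
    # Assumed FDAT: element-wise difference of the two triplet frequency vectors.
    diff = [v - h for v, h in zip(v_tri, h_tri)]
    return (v_tri + h_tri + diff
            + amino_acid_composition(virus_seq)
            + amino_acid_composition(human_seq))
```

Under these assumptions every protein pair yields a vector of the same length (3 × 8000 triplet entries plus 2 × 20 composition entries), regardless of sequence length, which is the property a classifier such as an SVM needs from its input.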

Keywords: Protein–protein interactions; hepatitis C virus; human papillomaviruses.

MeSH terms

  • Alphapapillomavirus / metabolism
  • Amino Acids / analysis
  • Amino Acids / metabolism
  • Computational Biology / methods*
  • Databases, Protein
  • Gene Ontology
  • Hepacivirus / metabolism
  • Host-Pathogen Interactions
  • Humans
  • Protein Interaction Mapping / methods*
  • Reproducibility of Results
  • Support Vector Machine*
  • Viral Proteins / metabolism*

Substances

  • Amino Acids
  • Viral Proteins