Predicting protein-protein interactions based only on sequences information

Proc Natl Acad Sci U S A. 2007 Mar 13;104(11):4337-41. doi: 10.1073/pnas.0607879104. Epub 2007 Mar 5.

Abstract

Protein-protein interactions (PPIs) are central to most biological processes. Although efforts have been devoted to developing methods for predicting PPIs and protein interaction networks, the application of most existing methods is limited because they need information about protein homology or the interaction marks of the protein partners. In the present work, we propose a method for PPI prediction using only protein sequence information. The method is based on a learning algorithm, the support vector machine (SVM), combined with a kernel function and a conjoint triad feature for describing amino acids. More than 16,000 diverse PPI pairs were used to construct the universal model. The prediction ability of our approach is better than that of other sequence-based PPI prediction methods because it is able to predict PPI networks. Different types of PPI networks have been effectively mapped with our method, suggesting that, even with only sequence information, this method could be applied to the exploration of networks for any newly discovered protein with unknown biological relationships. In addition, supplementary experimental information can enhance the prediction ability of the method.
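
The abstract names a conjoint triad feature but does not define it. The sketch below shows one plausible way to compute such a feature: each residue is mapped to one of seven classes, and every window of three consecutive residues is counted as a class triad, giving a 7 x 7 x 7 = 343-dimensional vector per protein. The seven-class grouping, the min-max scaling, the concatenation of the two proteins' vectors, and the function names `triad_features` and `pair_features` are all assumptions made for illustration and are not taken from the abstract.

```python
from itertools import product

# ASSUMPTION: a 7-class grouping of the 20 standard amino acids.
# The abstract does not state the grouping; this one is illustrative.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# Every ordered triple of classes gets a fixed position in the vector.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def triad_features(seq):
    """Return a 343-dim vector of scaled class-triad counts for one sequence."""
    counts = [0] * len(TRIAD_INDEX)
    classes = [AA_TO_CLASS.get(aa) for aa in seq.upper()]
    for i in range(len(classes) - 2):
        window = tuple(classes[i:i + 3])
        if None in window:  # skip windows containing non-standard residues
            continue
        counts[TRIAD_INDEX[window]] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:  # no usable triads, or all counts equal
        return [0.0] * len(counts)
    # ASSUMPTION: min-max scaling to reduce the effect of sequence length.
    return [(c - lo) / (hi - lo) for c in counts]


def pair_features(seq_a, seq_b):
    """Describe a protein pair by concatenating both proteins' vectors."""
    return triad_features(seq_a) + triad_features(seq_b)
```

A vector from `pair_features` could then be passed to any SVM implementation together with an interacting or non-interacting label; the kernel function used by the authors is not specified in the abstract, so it is left out of this sketch.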

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry
  • Base Sequence
  • Computational Biology / methods*
  • Databases, Protein
  • Humans
  • Programming Languages
  • Protein Interaction Mapping / methods*
  • Proteomics / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Analysis, Protein
  • Software

Substances

  • Amino Acids