Detecting protein-protein interactions with a novel matrix-based protein sequence representation and support vector machines

Biomed Res Int. 2015;2015:867516. doi: 10.1155/2015/867516. Epub 2015 Apr 27.

Abstract

Proteins and their interactions lie at the heart of most biological processes. Consequently, correct detection of protein-protein interactions (PPIs) is fundamental to understanding the molecular mechanisms of biological systems. Although advances in high-throughput experimental techniques make it possible to detect large numbers of PPIs, the data generated by these methods are unreliable and may not include all possible PPIs. To address this problem, this study develops a novel computational approach for detecting protein interactions. The approach combines a novel matrix-based representation of protein sequences, which takes full account of the sequence order and dipeptide information of the primary sequence, with a support vector machine (SVM). When applied to yeast PPI datasets, the proposed method reaches 90.06% prediction accuracy, with 94.37% specificity at a sensitivity of 85.74%, indicating that the predictor is a useful tool for predicting PPIs. The results also show that the approach can be a helpful supplement to experimentally detected interactions.
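As a rough illustration of the ideas in the abstract, the sketch below builds a simple dipeptide frequency matrix for a sequence and computes the three reported evaluation metrics from confusion-matrix counts. It is not the representation used in the paper, whose details the abstract does not give. The function names (`dipeptide_matrix`, `pair_features`, `evaluate`), the 20 x 20 layout, the normalization, and the confusion-matrix counts are all assumptions made for illustration. The counts were chosen only to reproduce the reported sensitivity and specificity on a hypothetical balanced test set; they are not data from the study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def dipeptide_matrix(sequence):
    """Return a 20x20 matrix of adjacent-residue (dipeptide) frequencies.

    Entry [i][j] is the fraction of adjacent pairs in which residue i is
    immediately followed by residue j. Pairs containing a non-standard
    residue are skipped. This layout is an illustrative assumption.
    """
    matrix = [[0.0] * 20 for _ in range(20)]
    total = 0
    for a, b in zip(sequence, sequence[1:]):
        if a in INDEX and b in INDEX:
            matrix[INDEX[a]][INDEX[b]] += 1
            total += 1
    if total:
        matrix = [[count / total for count in row] for row in matrix]
    return matrix


def pair_features(seq_a, seq_b):
    """Concatenate the flattened matrices of two proteins (800 values).

    A vector like this could be passed to an SVM classifier as the
    features of one candidate protein pair.
    """
    features = []
    for seq in (seq_a, seq_b):
        for row in dipeptide_matrix(seq):
            features.extend(row)
    return features


def evaluate(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, and specificity from counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts: 10,000 interacting and 10,000 non-interacting pairs.
metrics = evaluate(tp=8574, tn=9437, fp=563, fn=1426)
print(metrics)
```

With equal numbers of positive and negative pairs, accuracy is the mean of sensitivity and specificity: (85.74% + 94.37%) / 2 = 90.055%. That agrees with the reported 90.06% and is consistent with, though not proof of, a balanced evaluation set.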

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence / genetics*
  • Computational Biology*
  • Databases, Protein
  • Helicobacter pylori / genetics
  • Humans
  • Protein Interaction Mapping / methods*
  • Proteins / genetics*
  • Saccharomyces cerevisiae
  • Sequence Analysis, Protein
  • Support Vector Machine

Substances

  • Proteins