Globally predicting protein functions based on co-expressed protein-protein interaction networks and ontology taxonomy similarities

Gene. 2007 Apr 15;391(1-2):113-9. doi: 10.1016/j.gene.2006.12.008. Epub 2006 Dec 22.

Abstract

Determining protein functions is an important task in the post-genomic era. Most current methods work on a few large functional classes selected from functional categorization systems before prediction begins. GESTs, a prediction approach we proposed previously, is based on gene expression similarity and the taxonomy similarity of functional classes. Unlike many conventional methods, it does not require pre-selecting functional classes, and it can predict specific functions for genes from the functional annotations of their co-expressed genes. In this paper, we extend this method to the analysis of protein-protein interaction data. We use gene expression data to filter the interacting neighbors of a protein in order to increase the functional consensus among those neighbors. Using the taxonomy similarity of protein functional classes, the proposed approach can draw on interacting neighbor proteins annotated to nearby classes to support predictions for an uncharacterized protein, and it automatically selects the most appropriate small, specific functional classes in Gene Ontology (GO) during the learning process. Using three measures designed specifically for functional classes organized in GO, we evaluate how different taxonomy similarity scores affect prediction performance. Based on yeast protein-protein interaction data from MIPS and a dataset of gene expression profiles, we show that this method is effective at predicting protein function down to very specific terms. Compared with the other two taxonomy similarity measures used in this study, the SB-TS measure that we propose is a reasonable choice for ontology-based functional prediction when the goal is higher prediction accuracy at an acceptable level of specificity (predicted depth).
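The two steps described in the abstract (filtering interacting neighbors by co-expression, then letting neighbors annotated to nearby classes support a candidate class) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define SB-TS, so a Jaccard overlap of ancestor sets stands in for the taxonomy similarity. The correlation threshold, all function names, and the toy proteins, expression profiles, and ontology terms are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(vx * vy)

def coexpressed_neighbors(protein, interactions, expression, threshold):
    """Keep only interacting neighbors whose profile correlates with the protein's."""
    if protein not in expression:
        return []
    return [
        nb for nb in interactions.get(protein, ())
        if nb in expression
        and pearson(expression[protein], expression[nb]) >= threshold
    ]

def ancestors(term, parents):
    """All terms reachable through parent links, including the term itself."""
    seen, stack = {term}, [term]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def taxonomy_similarity(a, b, parents):
    """Assumed stand-in (Jaccard overlap of ancestor sets), NOT the SB-TS measure."""
    sa, sb = ancestors(a, parents), ancestors(b, parents)
    return len(sa & sb) / len(sa | sb)

def predict(protein, interactions, expression, annotations, parents, threshold=0.5):
    """Rank candidate classes by similarity-weighted support from filtered neighbors."""
    neighbors = coexpressed_neighbors(protein, interactions, expression, threshold)
    candidates = {t for nb in neighbors for t in annotations.get(nb, ())}
    scores = {}
    for c in candidates:
        # Each neighbor contributes its best-matching annotation, so a neighbor
        # annotated to a nearby class still lends partial support.
        scores[c] = sum(
            max((taxonomy_similarity(c, t, parents)
                 for t in annotations.get(nb, ())), default=0.0)
            for nb in neighbors
        )
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy data: a tiny ontology, one uncharacterized protein P0.
parents = {"metabolism": ["root"], "glycolysis": ["metabolism"], "transport": ["root"]}
interactions = {"P0": ["P1", "P2", "P3", "P4"]}
expression = {
    "P0": [1.0, 2.0, 3.0, 4.0],
    "P1": [2.0, 4.0, 6.0, 8.0],
    "P2": [1.1, 2.1, 2.9, 4.2],
    "P3": [4.0, 3.0, 2.0, 1.0],  # anti-correlated, so it is filtered out
    "P4": [0.9, 2.2, 3.1, 3.9],
}
annotations = {"P1": ["glycolysis"], "P2": ["metabolism"],
               "P3": ["transport"], "P4": ["glycolysis"]}

ranking = predict("P0", interactions, expression, annotations, parents)
```

In this toy setup the anti-correlated neighbor P3 is dropped before scoring, so its class never becomes a candidate, and the neighbor annotated to the parent class still lends partial support to the more specific class.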

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Gene Expression Profiling
  • Protein Interaction Mapping / methods*
  • Proteins / genetics
  • Proteins / metabolism*
  • Proteins / physiology
  • Reproducibility of Results

Substances

  • Proteins