Predicting protein phosphorylation from gene expression: top methods from the IMPROVER Species Translation Challenge

Bioinformatics. 2015 Feb 15;31(4):462-70. doi: 10.1093/bioinformatics/btu490. Epub 2014 Jul 23.

Abstract

Motivation: Using gene expression to infer changes in protein phosphorylation levels induced in cells by various stimuli is an outstanding problem. The intra-species protein phosphorylation challenge organized by the IMPROVER consortium provided the framework to identify the best approaches to address this issue.

Results: Rat lung epithelial cells were treated with 52 stimuli, and gene expression and phosphorylation levels were measured. Competing teams used gene expression data from 26 stimuli to develop protein phosphorylation prediction models and were ranked based on prediction performance for the remaining 26 stimuli. Three teams tied for first place in this challenge, achieving a balanced accuracy of about 70%, indicating that gene expression is only moderately predictive of protein phosphorylation. Despite the similar performance, the approaches used by these three teams, described in detail in this article, differed, with the average number of predictor genes per phosphoprotein ranging from 3 to 124 across teams. Nevertheless, a significant overlap of gene signatures between teams was observed for the majority of the proteins considered, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were enriched in the union of the predictor genes of the three teams for multiple proteins.
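For readers unfamiliar with the metric, balanced accuracy is commonly defined as the mean of sensitivity and specificity, which makes it robust to class imbalance between phosphorylated and non-phosphorylated outcomes. The sketch below illustrates that standard definition only; it is not the challenge's scoring code, and the function name and the example labels are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count true positives and true negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Count actual positives and negatives.
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / pos
    specificity = tn / neg
    return 0.5 * (sensitivity + specificity)


# Hypothetical labels for ten held-out stimuli (illustration only, not challenge data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
score = balanced_accuracy(y_true, y_pred)
```

Under this definition, a predictor that always outputs the majority class scores 0.5 regardless of class proportions, which is the baseline against which a value of about 70% would be read.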

Availability and implementation: Gene expression and protein phosphorylation data are available from ArrayExpress (E-MTAB-2091). Software implementations of the approaches of Teams 49 and 75 are available at http://bioinformaticsprb.med.wayne.edu and http://people.cs.clemson.edu/~luofeng/sbv.rar, respectively.

Contact: gyanbhanot@gmail.com or luofeng@clemson.edu or atarca@med.wayne.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Cells, Cultured
  • Databases, Factual
  • Epithelial Cells / cytology
  • Epithelial Cells / metabolism*
  • Gene Expression Profiling*
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Humans
  • Lung / cytology
  • Lung / metabolism*
  • Oligonucleotide Array Sequence Analysis
  • Phosphoproteins / metabolism*
  • Phosphorylation
  • Rats
  • Software*
  • Species Specificity
  • Systems Biology / methods*
  • Translational Research, Biomedical

Substances

  • Phosphoproteins