A proteogenomics approach integrating proteomics and ribosome profiling increases the efficiency of protein identification and enables the discovery of alternative translation start sites

Proteomics. 2014 Dec;14(23-24):2688-98. doi: 10.1002/pmic.201400180. Epub 2014 Oct 2.

Abstract

Next-generation transcriptome sequencing is increasingly integrated with MS to enhance MS-based protein and peptide identification. Recently, a breakthrough in transcriptome analysis was achieved with the development of ribosome profiling (ribo-seq). This technology is based on the deep sequencing of ribosome-protected mRNA fragments, thereby enabling the direct observation of in vivo protein synthesis at the transcript level. To explore the impact of a ribo-seq-derived protein sequence search space on MS/MS spectrum identification, we performed a comprehensive proteome study on a human cancer cell line, combining shotgun and N-terminal proteomics with ribosome profiling, which was used to delineate (alternative) translational reading frames. By including protein-level evidence of sample-specific genetic variation and alternative translation, this strategy improved the identification score of 69 proteins and identified 22 new proteins in the shotgun experiment. Furthermore, we discovered 18 new alternative translation start sites in the N-terminal proteomics data and observed a correlation between the quantitative measures of ribo-seq and shotgun proteomics, with Pearson correlation coefficients ranging from 0.483 to 0.664. Overall, this study demonstrated the benefits of ribosome profiling for MS-based protein and peptide identification, and we believe this approach could develop into a common practice for next-generation proteomics.
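
As an illustration of the correlation measure reported in the abstract, the following minimal sketch computes a Pearson correlation coefficient between two paired lists of per-protein quantities. The function name, the input values, and the choice of quantities (for example ribo-seq read counts against a shotgun abundance estimate) are assumptions made for illustration; the abstract does not specify which quantitative measures were compared or whether they were transformed beforehand.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values only, NOT data from the study: one entry per protein.
    ribo_seq_measure = [10.0, 20.0, 30.0, 40.0]
    shotgun_measure = [1.0, 3.0, 2.0, 5.0]
    print(pearson(ribo_seq_measure, shotgun_measure))
```

A coefficient near 1 indicates a strong positive linear relationship between the two measures, while values in the range reported above indicate a moderate one.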

Keywords: Bioinformatics; N-terminomics; Proteogenomics; Ribosome profiling; Translation initiation.

MeSH terms

  • Computational Biology / methods*
  • HCT116 Cells
  • Humans
  • Protein Biosynthesis / genetics
  • Proteins / genetics
  • Proteins / metabolism*
  • Proteomics / methods*
  • Ribosomes / metabolism*
  • Tandem Mass Spectrometry

Substances

  • Proteins