OpenProt 2.0 builds a path to the functional characterization of alternative proteins

Nucleic Acids Res. 2024 Jan 5;52(D1):D522-D528. doi: 10.1093/nar/gkad1050.

Abstract

The OpenProt proteogenomic resource (https://www.openprot.org/) provides users with a complete and freely accessible set of non-canonical or alternative open reading frames (AltORFs) within the transcriptome of various species, as well as functional annotations of the corresponding protein sequences not found in standard databases. Enhancements in this update are largely the result of user feedback and include the prediction of structure, subcellular localization, and intrinsic disorder, using cutting-edge algorithms based on machine learning techniques. The mass spectrometry pipeline now integrates a machine learning-based peptide rescoring method to improve peptide identification. We continue to help users explore this cryptic proteome by providing OpenCustomDB, a tool that enables users to build their own customized protein databases, and OpenVar, a genomic annotator including genetic variants within AltORFs and protein sequences. A new interface improves the visualization of all functional annotations, including a spectral viewer and the prediction of multicoding genes. All data on OpenProt are freely available and downloadable. Overall, OpenProt continues to establish itself as an important resource for the exploration and study of new proteins.

MeSH terms

  • Amino Acid Sequence
  • Databases, Protein*
  • Genomics
  • Humans
  • Internet
  • Peptides* / genetics
  • Proteome / genetics
  • Proteomics* / methods

Substances

  • Peptides
  • Proteome