Alternative ORFs and small ORFs: shedding light on the dark proteome

Nucleic Acids Res. 2020 Feb 20;48(3):1029-1042. doi: 10.1093/nar/gkz734.

Abstract

Traditional annotation of protein-encoding genes relied on assumptions, such as that one open reading frame (ORF) encodes one protein and that translated proteins exceed a minimal length. With the serendipitous discoveries of translated ORFs encoded upstream and downstream of annotated ORFs, from alternative start sites nested within annotated ORFs and from RNAs previously considered noncoding, it is becoming clear that these initial assumptions are incorrect. These findings have led to the realization that genetic information is more densely coded and that the proteome is more complex than previously anticipated. As such, interest in the identification and characterization of the previously ignored 'dark proteome' is increasing, though we note that research in eukaryotes and bacteria has largely progressed in isolation. To bridge this gap and illustrate exciting findings emerging from studies of the dark proteome, we highlight recent advances in both eukaryotic and bacterial cells. We discuss progress in the detection of alternative ORFs, in the understanding of their functions and the regulation of their expression, and pose questions for future work.
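
As an illustration of the annotation assumptions described above, the following is a minimal sketch of a forward-strand ORF scan. It is not a method from the review. The names `find_orfs`, `Orf` and `min_codons` are hypothetical, the toy sequence is invented for illustration, and the scan assumes ATG as the only start codon and ignores the reverse strand (both simplifications; bacteria, for example, also use other start codons). It reports each start codon together with the next in-frame stop, so a start nested within a longer ORF is listed separately, and a minimum-length cutoff can be seen to hide the shorter ORF.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons
START_CODON = "ATG"                  # assumption: ATG is the only start codon


class Orf(NamedTuple):
    start: int   # 0-based index of the first base of the start codon
    end: int     # index one past the last base of the stop codon
    frame: int   # reading frame of the start codon (start % 3)
    codons: int  # sense codons, counting the start and excluding the stop


def find_orfs(seq: str, min_codons: int = 0) -> List[Orf]:
    """Return every start-to-next-in-frame-stop ORF on the forward strand."""
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != START_CODON:
            continue
        # Walk codon by codon from the start until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                codons = (pos - start) // 3
                if codons >= min_codons:
                    orfs.append(Orf(start, pos + 3, start % 3, codons))
                break  # starts with no in-frame stop are not reported
    return orfs


# Toy sequence (invented): ATG GCC ATG GGC TGA
toy = "ATGGCCATGGGCTGA"

all_orfs = find_orfs(toy)                  # no length cutoff
long_only = find_orfs(toy, min_codons=3)   # length cutoff applied
```

With no cutoff, the scan on the toy sequence reports both the full-length ORF and the shorter one beginning at the nested in-frame ATG. With `min_codons=3`, only the longer ORF remains, which mirrors how a length threshold can leave small or alternative ORFs unannotated.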

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Disease / genetics
  • Gene Expression Regulation*
  • Gene Expression Regulation, Bacterial
  • Humans
  • Membrane Fusion
  • Membrane Proteins / metabolism
  • Open Reading Frames*
  • Peptide Chain Initiation, Translational*
  • Protein Biosynthesis
  • Proteins / physiology
  • Proteome / genetics*
  • Transcription, Genetic

Substances

  • Membrane Proteins
  • Proteins
  • Proteome