Distinct types of short open reading frames are translated in plant cells

Genome Res. 2019 Sep;29(9):1464-1477. doi: 10.1101/gr.253302.119. Epub 2019 Aug 6.

Abstract

Genomes contain millions of short (<100 codons) open reading frames (sORFs), which are usually dismissed during gene annotation. Nevertheless, peptides encoded by such sORFs can play important biological roles, and their impact on cellular processes has long been underestimated. Here, we analyzed approximately 70,000 transcribed sORFs in the model plant Physcomitrella patens (moss). Several distinct classes of sORFs that differ in their position on transcripts and their level of evolutionary conservation are present in the moss genome. Over 5000 sORFs were conserved in at least one of 10 plant species examined. Mass spectrometry analysis of proteomic and peptidomic data sets suggested that tens of sORFs located on distinct parts of mRNAs and long noncoding RNAs (lncRNAs) are translated, including conserved sORFs. Translational analysis of the sORFs and main ORFs at a single locus suggested the existence of genes that code for multiple proteins and peptides with tissue-specific expression. Functional analysis of four lncRNA-encoded peptides showed that sORF-encoded peptides are involved in the regulation of growth and differentiation in moss. Knocking out lncRNA-encoded peptides resulted in a decrease in moss growth. In contrast, overexpression of these peptides resulted in a diverse range of phenotypic effects. Our results thus open new avenues for discovering novel, biologically active peptides in the plant kingdom.
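
To make the "<100 codons" definition concrete, the following is a minimal sketch of how candidate sORFs might be enumerated on a single transcript. It is not the pipeline used in the study, which the abstract does not describe. The function name `find_sorfs`, the ATG-only start rule, the forward-strand-only scan, and the choice to count codons excluding the stop are all assumptions made for illustration.

```python
# Illustrative sketch only: the study's actual sORF-calling method is not
# given in the abstract. Assumptions (hypothetical): ATG start codons only,
# standard stop codons, forward strand only, length counted in codons
# excluding the stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 100  # "short" threshold quoted in the abstract (<100 codons)


def find_sorfs(transcript, max_codons=MAX_CODONS):
    """Return (start, end, codon_count) for each ATG..stop ORF below max_codons.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA input
    sorfs = []
    for frame in range(3):
        start = None  # first ATG seen since the last stop in this frame
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons < max_codons:
                    sorfs.append((start, pos + 3, n_codons))
                start = None  # resume searching after this stop
    return sorfs
```

Under these assumptions, a call such as `find_sorfs("CCATGAAATGGTAACC")` reports one three-codon ORF starting at the first ATG. A real analysis would also need to handle the reverse strand, alternative start codons, and the position of each sORF relative to the main ORF, which is how the abstract distinguishes its sORF classes.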

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bryopsida / genetics
  • Bryopsida / metabolism*
  • Gene Expression Regulation, Developmental
  • Gene Expression Regulation, Plant
  • Genome, Plant
  • Mass Spectrometry
  • Open Reading Frames*
  • Peptides / metabolism
  • Plant Proteins / metabolism
  • Protein Biosynthesis*
  • Proteomics / methods*
  • RNA, Long Noncoding
  • Sequence Analysis, DNA

Substances

  • Peptides
  • Plant Proteins
  • RNA, Long Noncoding