ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs

Nat Commun. 2024 Mar 7;15(1):2091. doi: 10.1038/s41467-024-46112-2.

Abstract

Identifying open reading frames (ORFs) being translated is not a trivial task. ProTInSeq is a technique designed to characterize proteomes by sequencing transposon insertions engineered to express a selection marker when they occur in-frame within a protein-coding gene. In the bacterium Mycoplasma pneumoniae, ProTInSeq identifies 83% of its annotated proteins, along with 5 proteins and 153 small ORF-encoded proteins (SEPs; ≤100 aa) that were not previously annotated. Moreover, ProTInSeq can be utilized for detecting translational noise, as well as for relative quantification and transmembrane topology estimation of fitness and non-essential proteins. By integrating various identification approaches, the number of initially annotated SEPs in this bacterium increases from 27 to 329, with a quarter of them predicted to possess antimicrobial potential. Herein, we describe a methodology complementary to Ribo-Seq and mass spectrometry that can identify SEPs while providing other insights into a proteome with a flexible and cost-effective DNA ultra-deep sequencing approach.
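
The in-frame criterion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Orf` type, the function names, the coordinate convention (0-based, end-exclusive, insertion between `pos - 1` and `pos`), and the rule that the marker is expressed when the number of coding bases upstream of the insertion is a multiple of three are all assumptions made for illustration. The actual reporter construct may require a different frame offset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Orf:
    """Hypothetical ORF record; coordinates are 0-based and end-exclusive."""
    start: int
    end: int
    strand: str  # "+" or "-"


def coding_bases_upstream(orf: Orf, pos: int) -> Optional[int]:
    """Count ORF bases translated before an insertion between pos-1 and pos.

    Returns None when the insertion does not fall strictly inside the ORF.
    """
    if not (orf.start < pos < orf.end):
        return None
    # On the minus strand, translation starts at the high-coordinate end.
    return pos - orf.start if orf.strand == "+" else orf.end - pos


def is_in_frame(orf: Orf, pos: int, insert_strand: str) -> bool:
    """Assumed rule: same orientation and a whole number of codons upstream."""
    upstream = coding_bases_upstream(orf, pos)
    if upstream is None or insert_strand != orf.strand:
        return False
    return upstream % 3 == 0


if __name__ == "__main__":
    gene = Orf(start=100, end=400, strand="+")
    for position in (103, 104, 105, 106):
        print(position, is_in_frame(gene, position, "+"))
```

Under these assumptions, roughly one in three same-orientation insertions within a translated ORF would be in-frame, which is why the positional pattern of surviving insertions carries information about which frame is being translated.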

MeSH terms

  • Bacteria* / genetics
  • Base Sequence
  • DNA
  • Open Reading Frames / genetics
  • Proteome* / genetics
  • Sequence Analysis, DNA

Substances

  • Proteome
  • DNA