CLT-seq as a universal homopolymer-sequencing concept reveals poly(A)-tail-tuned ncRNA regulation

Brief Bioinform. 2023 Sep 22;24(6):bbad374. doi: 10.1093/bib/bbad374.

Abstract

Dynamic tuning of the poly(A) tail is a crucial mechanism for controlling translation and stability of eukaryotic mRNA. A comprehensive understanding of this regulation requires unbiased abundance quantification of poly(A)-tailed transcripts and simple poly(A)-length measurement on high-throughput sequencing platforms. Current methods are limited by complicated setups and elaborate library-preparation protocols. To address this, we introduce central limit theorem (CLT)-managed RNA-seq (CLT-seq), a simple and straightforward homopolymer-sequencing method. In CLT-seq, an anchor-free oligo(dT) primer rapidly binds to and unbinds from any position along the poly(A) tail, so that reverse transcription is primed at each position with equal probability. Under the CLT mechanism, the synthesized poly(T) lengths, which correspond to the templated segment of the poly(A) tail, are normally distributed. Based on a well-fitted pseudogaussian-derived poly(A)-poly(T) conversion model, the actual poly(A)-tail profile is reconstructed from the acquired poly(T)-length profile through matrix operations. CLT-seq follows a simple procedure that requires no RNA pre-treatment, enrichment or selection, and the CLT-shortened poly(T) stretches are more compatible with existing sequencing platforms. This proof-of-concept approach facilitates direct homopolymer base-calling and features unbiased RNA-seq. Therefore, CLT-seq provides unbiased, robust and cost-efficient transcriptome-wide poly(A)-tail profiling. We demonstrate that CLT-seq on the most common Illumina platform delivers reliable poly(A)-tail profiling at a transcriptome-wide scale in human cellular contexts. We find that poly(A)-tail-tuned ncRNA regulation is a dynamic, complex process similar to mRNA regulation. Overall, CLT-seq offers a simplified, effective and economical approach to investigate poly(A)-tail regulation, with potential implications for understanding gene expression and identifying therapeutic targets.
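
The matrix-based reconstruction mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's fitted pseudogaussian conversion model. As a loudly labeled assumption, it uses a toy stand-in in which a tail of length a yields every poly(T) length from the primer length up to a with equal probability. The names PRIMER_LEN, MAX_TAIL, conversion_matrix, poly_a_to_poly_t and poly_t_to_poly_a are hypothetical and chosen only for illustration. The point is the general idea: the observed poly(T)-length profile is a conversion matrix applied to the poly(A)-length profile, so the latter can be recovered by solving a linear system.

```python
# Toy poly(A) -> poly(T) conversion and its inversion (illustrative only).
# ASSUMPTION: a uniform kernel stands in for the paper's pseudogaussian model:
# a tail of length a yields each poly(T) length in PRIMER_LEN..a equally often.

PRIMER_LEN = 3   # hypothetical primer length (nt)
MAX_TAIL = 10    # hypothetical longest tail considered (nt)


def conversion_matrix(primer_len=PRIMER_LEN, max_tail=MAX_TAIL):
    """Return M with M[i][j] = P(poly(T) = i + primer_len | poly(A) = j + primer_len)."""
    n = max_tail - primer_len + 1
    return [[1.0 / (j + 1) if i <= j else 0.0 for j in range(n)] for i in range(n)]


def poly_a_to_poly_t(matrix, poly_a_profile):
    """Forward model: expected poly(T)-length profile for a poly(A)-length profile."""
    return [sum(m * a for m, a in zip(row, poly_a_profile)) for row in matrix]


def poly_t_to_poly_a(matrix, poly_t_profile):
    """Invert the upper-triangular system by back-substitution."""
    n = len(poly_t_profile)
    poly_a = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Contribution of longer tails to poly(T) length bin i.
        longer = sum(matrix[i][j] * poly_a[j] for j in range(i + 1, n))
        poly_a[i] = (poly_t_profile[i] - longer) / matrix[i][i]
    return poly_a


M = conversion_matrix()
# Made-up fractions of transcripts with tail lengths 3..10 (not real data).
true_poly_a = [0.0, 0.1, 0.0, 0.2, 0.0, 0.4, 0.0, 0.3]
observed_poly_t = poly_a_to_poly_t(M, true_poly_a)
recovered_poly_a = poly_t_to_poly_a(M, observed_poly_t)
```

In this noise-free toy case the recovered profile matches the input. With real read counts, direct inversion can produce negative fractions, so a constrained fit such as non-negative least squares would likely be a more robust choice.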

Keywords: CLT-seq; ncRNA; poly(A) tail; poly(A)-poly(T) conversion; pseudogaussian distribution function.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Polyadenylation*
  • RNA, Messenger / genetics
  • RNA, Untranslated / genetics
  • RNA, Untranslated / metabolism
  • Sequence Analysis, RNA / methods
  • Transcriptome

Substances

  • RNA, Messenger
  • RNA, Untranslated