RNA-seq library preparation for comprehensive transcriptome analysis in cancer cells: The impact of insert size

Genomics. 2021 Nov;113(6):4149-4162. doi: 10.1016/j.ygeno.2021.10.018. Epub 2021 Nov 3.

Abstract

With long reads and high coverage, RNA-seq enables comprehensive transcriptome analysis of cancer cells, provided that the libraries (and their inserts) are long enough to avoid overlap of paired reads and the consequent loss of sequencing data. We assessed two TruSeq Stranded library preparation protocols, poly(A) enrichment (PA) and rRNA depletion (RD), for the thoroughness of transcriptome analysis of a heterogeneous cancer, acute lymphoblastic leukemia. We applied 2 × 150 nt paired-end sequencing with >150 M reads per sample on an Illumina NovaSeq6000. We show that PA outperforms RD for the analysis of gene expression and structural aberrations, whereas RD is more suitable for the detection of various classes of RNAs, mutations, or polymorphisms. We demonstrate that reduced RNA fragmentation time (generating longer inserts) improves detection of structural RNA changes without introducing bias into gene expression analysis. We recommend this modification for all RNA-seq studies that use reads longer than 75 nt and aim to go beyond gene expression analysis to detect structural changes as well.
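
The link between insert size and lost sequencing data can be illustrated with simple arithmetic. The sketch below is not taken from the study: the function names and the example insert sizes are hypothetical, and it assumes an idealized pair in which each mate reads inward from one end of the insert for exactly the read length. Under that assumption, any insert shorter than twice the read length makes the two mates sequence some bases twice (or run past the insert end).

```python
def redundant_bases(insert_size: int, read_length: int = 150) -> int:
    """Sequenced bases per pair that add no new insert sequence.

    Idealized model (an assumption, not the paper's method): both mates
    read inward from opposite ends of the insert for `read_length` bases.
    """
    if insert_size <= 0 or read_length <= 0:
        raise ValueError("insert_size and read_length must be positive")
    return max(0, 2 * read_length - insert_size)


def informative_fraction(insert_size: int, read_length: int = 150) -> float:
    """Fraction of the 2 x read_length sequenced bases that cover distinct insert bases."""
    if insert_size <= 0 or read_length <= 0:
        raise ValueError("insert_size and read_length must be positive")
    total = 2 * read_length
    return min(insert_size, total) / total


if __name__ == "__main__":
    # Hypothetical insert sizes chosen only for illustration.
    for insert in (150, 200, 300, 400):
        print(insert, redundant_bases(insert), round(informative_fraction(insert), 2))
```

In this model, 2 × 150 nt reads are only fully used once the insert reaches about 300 nt, which is consistent with the abstract's rationale for generating longer inserts when reads exceed 75 nt.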

Keywords: Cancer transcriptome; Insert size; NovaSeq6000; Poly(A) enrichment; RNA-seq; rRNA depletion.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling* / methods
  • Gene Library
  • High-Throughput Nucleotide Sequencing / methods
  • Neoplasms* / genetics
  • RNA, Messenger / metabolism
  • RNA-Seq
  • Sequence Analysis, RNA / methods
  • Transcriptome

Substances

  • RNA, Messenger