Long read isoform sequencing reveals hidden transcriptional complexity between cattle subspecies

BMC Genomics. 2023 Mar 13;24(1):108. doi: 10.1186/s12864-023-09212-9.

Abstract

The Iso-Seq method of full-length cDNA sequencing is suitable for quantifying differentially expressed genes (DEGs), differentially expressed transcripts (DETs) and differential transcript usage (DTU). However, the higher cost of Iso-Seq relative to RNA-seq has limited direct comparison of the two methods. Transcript abundance estimates from RNA-seq and deep Iso-Seq data for fetal liver from two cattle subspecies were compared to evaluate concordance. Inter-sample correlation of gene- and transcript-level abundance was higher within a technology than between technologies. Identification of DEGs between the cattle subspecies depended on the sequencing method: only 44 genes were identified by both, including 6 novel genes annotated by Iso-Seq. There was a pronounced difference between Iso-Seq and RNA-seq results at the transcript level, where Iso-Seq revealed several orders of magnitude more differences in transcript abundance and usage between subspecies. Factors influencing DEG identification included size selection during Iso-Seq library preparation, average transcript abundance, multi-mapping of RNA-seq reads to the reference genome, and overlapping coordinates of genes. Some DEGs called by RNA-seq alone appear to be sequence duplication artifacts. Among the 44 DEGs identified by both technologies, some play a role in the immune system, thyroid function and cell growth. Iso-Seq revealed hidden transcriptional complexity in DEGs, DETs and DTU genes between cattle subspecies that was previously missed by RNA-seq.

Keywords: Alternative splicing; Cattle; Differential Isoform expression; Iso-Seq; Long read sequencing; Multi-mapped reads; RNA-seq; Sequence duplication; Subspecies; Transcriptome.

MeSH terms

  • Alternative Splicing
  • Animals
  • Cattle / genetics
  • Gene Expression Profiling
  • Gene Library
  • Genome*
  • High-Throughput Nucleotide Sequencing / methods
  • Protein Isoforms / genetics
  • RNA-Seq
  • Sequence Analysis, RNA
  • Transcriptome*

Substances

  • Protein Isoforms