Full-length codling moth transcriptome atlas revealed by single-molecule real-time sequencing

Genomics. 2022 Mar;114(2):110299. doi: 10.1016/j.ygeno.2022.110299. Epub 2022 Feb 5.

Abstract

Over the past decade, second-generation sequencing (SGS) has been widely used to characterize transcriptomes across many organisms. However, full-length (FL) transcripts and alternatively spliced (AS) isoforms cannot be defined confidently and accurately with SGS. Here, Pacific Biosciences (PacBio) single-molecule real-time sequencing was used to obtain FL transcriptome data for the codling moth (Cydia pomonella). In total, 25,940 high-quality FL isoforms were obtained and clustered into 14,099 nonredundant clusters. Nearly 90% of the nonredundant PacBio transcripts were novel relative to the annotated reference genes, and 3,389 of them potentially represented novel genes. Additionally, a large number of AS events were discovered, and most of the splice junctions in the PacBio isoforms were supported by short reads from public datasets. Furthermore, 952 FL long noncoding RNAs (lncRNAs) and 81 fusion transcripts were identified and validated using RT-PCR analysis. Overall, an atlas of FL transcripts was obtained for the codling moth, which will provide further insight into the complexity of its transcriptome and facilitate improved genome annotation and functional studies in this insect.

Keywords: Alternative splicing; Cydia pomonella; Full-length transcriptome; Fusion gene; IsoSeq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Animals
  • High-Throughput Nucleotide Sequencing
  • Moths* / genetics
  • Protein Isoforms / genetics
  • Transcriptome*

Substances

  • Protein Isoforms