Integrative profiling of Epstein-Barr virus transcriptome using a multiplatform approach

Virol J. 2022 Jan 6;19(1):7. doi: 10.1186/s12985-021-01734-6.

Abstract

Background: Epstein-Barr virus (EBV) is an important human pathogenic gammaherpesvirus with carcinogenic potential. The EBV transcriptome has previously been analyzed using both Illumina-based short-read sequencing and Pacific Biosciences RS II-based long-read sequencing technologies. Since the various sequencing methods have distinct strengths and limitations, the use of multiplatform approaches has proven to be valuable. The aim of this study is to provide a more complete picture of the transcriptomic architecture of EBV.

Methods: In this work, we applied the Oxford Nanopore Technologies MinION long-read sequencing (LRS) platform to generate novel transcriptomic data, and integrated these with data generated by others using another LRS approach, Pacific Biosciences RS II sequencing, as well as Illumina CAGE-Seq and Poly(A)-Seq. Both amplified and non-amplified cDNA sequencing were used to generate sequencing reads, with both oligo(dT)- and random-oligonucleotide-primed reverse transcription. EBV transcripts were identified and annotated using the LoRTIA software suite developed in our laboratory.

Results: This study detected novel genes embedded in longer host genes containing 5'-truncated in-frame open reading frames, which potentially encode N-terminally truncated proteins. We also detected a number of novel non-coding RNAs and transcript length isoforms, which are encoded by the same genes but differ in their start and/or end sites. This study also reports the discovery of novel splice isoforms, many of which may have altered coding potential, and of novel replication-origin-associated transcripts. Additionally, novel mono- and multigenic transcripts were identified. An intricate meshwork of transcriptional overlaps was revealed.

Conclusions: An integrative approach combining multiple sequencing technologies is suitable for the reliable characterization of complex transcriptomes, because each technique has different advantages and limitations, and they can be used to validate the results obtained by any one approach.

Keywords: Epstein–Barr virus; Herpesvirus; Long-read sequencing; Nanopore sequencing; PacBio sequencing; Splice variant; Transcript isoform; Transcription end site; Transcription start site; Transcriptome.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Epstein-Barr Virus Infections* / genetics
  • Gene Expression Profiling
  • Herpesvirus 4, Human / genetics
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Open Reading Frames
  • Transcriptome*