PacBio single-molecule long-read sequencing shed new light on the transcripts and splice isoforms of the perennial ryegrass

Mol Genet Genomics. 2020 Mar;295(2):475-489. doi: 10.1007/s00438-019-01635-y. Epub 2020 Jan 1.

Abstract

Perennial ryegrass (Lolium perenne), one of the most widely used forage and cool-season turfgrasses worldwide, has a breeding history of more than 100 years. However, the current draft genome annotation and transcriptome characterization remain incomplete, mainly because full-length transcripts are difficult to obtain. To explore the complete structure of mRNAs and improve the current draft genome annotation, we performed PacBio single-molecule long-read sequencing of the full-length transcriptome of perennial ryegrass. We generated 29,175 high-confidence non-redundant transcripts from 15,893 genetic loci, among which more than 66.88% of transcripts and 24.99% of genetic loci were not previously annotated in the current reference genome. The 18,327 re-annotated transcripts enriched the reference transcriptome. In particular, 6709 alternative splicing events and 23,789 alternative polyadenylation sites were detected, providing a comprehensive landscape of the post-transcriptional regulation network. Furthermore, we identified 218 long non-coding RNAs and 478 fusion genes. Finally, we explored the transcriptional regulation mechanism of perennial ryegrass in response to drought stress using the newly updated reference transcriptome sequences, providing new information on the underlying transcriptional regulation network. Taken together, these results improve our understanding of the perennial ryegrass transcriptome and refine the annotation of the reference genome.

Keywords: Alternative polyadenylation events; Alternative splicing events; PacBio single-molecule long-read sequencing; Perennial ryegrass; Reference genome annotation.

MeSH terms

  • Alternative Splicing / genetics*
  • Gene Expression Regulation, Plant / genetics
  • Genome, Plant / genetics*
  • High-Throughput Nucleotide Sequencing
  • Lolium / genetics*
  • Molecular Sequence Annotation
  • Protein Isoforms / genetics
  • RNA, Long Noncoding / genetics
  • Single Molecule Imaging
  • Transcriptome / genetics*

Substances

  • Protein Isoforms
  • RNA, Long Noncoding