A survey of transcriptome complexity using full-length isoform sequencing in the tea plant Camellia sinensis

Mol Genet Genomics. 2022 Sep;297(5):1243-1255. doi: 10.1007/s00438-022-01913-2. Epub 2022 Jun 28.

Abstract

Tea is one of the most popular beverages, and its leaves are rich in catechins, which contribute to its diverse flavor and are beneficial for human health. However, the post-transcriptional regulatory mechanisms affecting catechin synthesis remain insufficiently studied. Here, we sequenced the transcriptome using PacBio sequencing technology and obtained 63,111 full-length high-quality isoforms, including 1302 potential novel genes and 583 highly reliable fusion transcripts. We also identified 1204 high-quality lncRNAs, comprising 188 known and 1016 novel lncRNAs. In addition, 311 mis-annotated genes were corrected based on the high-quality Iso-Seq reads. A large number of alternative splicing (AS) events (3784) and alternative polyadenylation (APA) genes (18,714) were analyzed, accounting for 8.84% and 43.7% of the total annotated genes, respectively. We also found that 2884 genes with AS and APA features exhibited higher expression levels than other genes. These genes are mainly involved in amino acid biosynthesis, carbon fixation in photosynthetic organisms, phenylalanine, tyrosine, and tryptophan biosynthesis, and pyruvate metabolism, suggesting that they play an essential role in determining the catechin content of tea polyphenols. Our results further improve the genome annotation and indicate that post-transcriptional regulation plays a crucial part in catechin synthesis.

Keywords: Alternative polyadenylation; Alternative splicing; Catechins; LncRNA; PacBio Iso-Seq; Tea plant.

MeSH terms

  • Alternative Splicing
  • Camellia sinensis*
  • Catechin*
  • Gene Expression Regulation, Plant
  • Humans
  • Plant Leaves
  • Plant Proteins
  • Protein Isoforms
  • RNA, Long Noncoding*
  • Tea
  • Transcriptome

Substances

  • Plant Proteins
  • Protein Isoforms
  • RNA, Long Noncoding
  • Tea
  • Catechin