Full-Length Transcriptome from Camellia oleifera Seed Provides Insight into the Transcript Variants Involved in Oil Biosynthesis

J Agric Food Chem. 2020 Dec 9;68(49):14670-14683. doi: 10.1021/acs.jafc.0c05381. Epub 2020 Nov 29.

Abstract

Camellia oleifera Abel., a member of the genus Camellia in the family Theaceae, is widely used as a source of cooking oil, lubricant, and cosmetic ingredients. Because of its complicated polyploidization and large genome, reference genome information is still lacking. Systematic characterization of gene models based on transcriptome data is a fast and economical approach for C. oleifera. Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq) and Illumina RNA-Seq, combined with gas chromatography, were used to explore oil biosynthesis and accumulation and to analyze the transcriptome comprehensively in C. oleifera seeds at five developmental stages. We report the first full-length transcriptome data set of C. oleifera seeds, comprising 40,143 nonredundant high-quality isoforms. Among these isoforms, 37,982 were functionally annotated, and 271 (2.43%) belonged to fatty acid metabolism. A total of 8,344 full-length unique transcript models were obtained, and 8,151 (97.69%) of them produced more than two isoforms, suggesting a high degree of transcriptome complexity in C. oleifera seeds. A total of 783 alternative splicing (AS) events were identified, among which retained introns were the most abundant. We also obtained 1,910 long noncoding RNAs (lncRNAs) and found that AS events occurred in these lncRNAs. Potential transcript variants of genes involved in oil biosynthesis were also investigated. Weighted correlation network analysis identified seven "gene modules", together with hub genes for each module, that showed a significant association with oil content. The series test of clusters classified these modules into four significant profiles based on gene expression patterns. Protein-protein interaction network analysis showed that upregulated WRI1 interacted with 17 genes encoding enzymes that play key roles in oil synthesis. MYB and ZIP transcription factors also showed significant interactions with key genes involved in oil synthesis. Collectively, our data advance the knowledge of RNA isoform diversity in seeds at different developmental stages and provide a rich resource for functional studies on oil synthesis in C. oleifera.

Keywords: Alternative splicing; Camellia oleifera; Iso-Seq; LncRNA; Oil synthesis; Transcription factors.

MeSH terms

  • Alternative Splicing
  • Camellia / chemistry
  • Camellia / genetics*
  • Camellia / metabolism
  • Gene Expression Profiling
  • Plant Oils / metabolism*
  • Plant Proteins / genetics*
  • Plant Proteins / metabolism
  • Seeds / chemistry
  • Seeds / genetics
  • Seeds / metabolism
  • Transcriptome

Substances

  • Plant Oils
  • Plant Proteins