Single-molecule real-time sequencing of the full-length transcriptome of purple garlic (Allium sativum L. cv. Leduzipi) and identification of serine O-acetyltransferase family proteins involved in cysteine biosynthesis

J Sci Food Agric. 2022 May;102(7):2864-2873. doi: 10.1002/jsfa.11627. Epub 2021 Nov 13.

Abstract

Background: Garlic (Allium sativum L.) is a herbaceous perennial widely consumed as a green vegetable and a condiment, and its bioactive components are mainly organosulfur compounds (OSCs). However, the metabolic enzymes involved in OSC biosynthesis have not yet been identified in garlic.

Results: Here, a full-length transcriptome of purple garlic was generated via PacBio and Illumina sequencing to characterize the garlic transcriptome and identify key proteins mediating the biosynthesis of OSCs. Overall, 22.56 Gb of clean data were generated, yielding 454 698 circular consensus sequence (CCS) reads, of which 83.4% (379 206) were identified as full-length non-chimeric reads. Clustering these reads into transcripts produced 36 571 high-quality consensus reads. After correction, mapping these reads to the genome revealed that 6140 were novel isoforms of known genes and 2186 were novel isoforms from novel genes. We detected 1677 alternative splicing events and found 2902 genes with two or more poly(A) sites. Given the importance of serine O-acetyltransferase (SERAT) in cysteine biosynthesis, we investigated the five SERAT homologs in garlic. Phylogenetic analysis revealed a three-tier classification of SERAT proteins, each of which features an N-terminal serine acetyltransferase domain and one or two hexapeptide transferase motifs. Template-based modeling showed that garlic SERATs share a common homo-trimeric structure with homologs from bacteria and other plants. The residues responsible for substrate recognition and catalysis were highly conserved, implying a similar reaction mechanism. Transcript profiling showed that the expression patterns of the five SERAT genes varied significantly among tissues.
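
As a quick consistency check on the read counts reported above, the following minimal Python sketch recomputes the full-length non-chimeric fraction and sums the two classes of novel isoforms. The variable names and the `percent` helper are illustrative assumptions and are not part of the study's pipeline; only the counts come from the abstract.

```python
# Read counts as reported in the abstract (the only values taken from the source).
ccs_reads = 454_698                    # circular consensus sequence (CCS) reads
flnc_reads = 379_206                   # full-length non-chimeric reads
novel_isoforms_known_genes = 6_140     # novel isoforms of known genes
novel_isoforms_novel_genes = 2_186     # novel isoforms from novel genes


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place.

    Hypothetical helper for illustration only.
    """
    return round(100 * part / whole, 1)


# Share of CCS reads classified as full-length non-chimeric.
flnc_pct = percent(flnc_reads, ccs_reads)

# Total number of reads reported as novel isoforms.
novel_total = novel_isoforms_known_genes + novel_isoforms_novel_genes

print(f"FLNC fraction of CCS reads: {flnc_pct}%")
print(f"Novel isoform reads in total: {novel_total}")
```

The recomputed fraction agrees with the 83.4% stated in the Results.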

Conclusion: These findings deepen our knowledge of SERAT proteins and provide timely genetic resources that could support future work on the genetic improvement and breeding of garlic. © 2021 Society of Chemical Industry.

Keywords: cysteine; full-length transcriptome; garlic; organosulfur compounds; serine O-acetyltransferase.

MeSH terms

  • Cysteine / metabolism
  • Garlic* / genetics
  • Garlic* / metabolism
  • Phylogeny
  • Plant Breeding
  • Protein Isoforms / genetics
  • Serine O-Acetyltransferase / genetics
  • Serine O-Acetyltransferase / metabolism
  • Transcriptome*

Substances

  • Protein Isoforms
  • Serine O-Acetyltransferase
  • Cysteine