Improved assembly and annotation of the sesame genome

DNA Res. 2022 Dec 1;29(6):dsac041. doi: 10.1093/dnares/dsac041.

Abstract

Sesame (Sesamum indicum L.) is an important oilseed crop that produces abundant seed oil and has a pleasant flavor and high nutritional value. To date, several Illumina-based genome assemblies corresponding to different sesame genotypes have been published and widely used in genetic and genomic studies of sesame. However, these assemblies consistently showed low continuity with numerous gaps. Here, we reported a high-quality, reference-level sesame genome assembly by integrating PacBio high-fidelity sequencing and Hi-C technology. Our updated sesame assembly was 309.35 Mb in size with a high chromosome anchoring rate (97.54%) and contig N50 size (13.48 Mb), which were better than previously published genomes. We identified 163.38 Mb repetitive elements and 24,345 high-confidence protein-coding genes in the updated sesame assembly. Comparative genomic analysis showed that sesame shared an ancient whole-genome duplication event with two Lamiales species. A total of 2,782 genes were tandemly duplicated. We also identified several genes that were likely involved in fatty acid and triacylglycerol biosynthesis. Our improved sesame assembly and annotation will facilitate future genetic studies and genomics-assisted breeding of sesame.

Keywords: PacBio high-fidelity sequencing; genome assembly; oil metabolism; oilseed crop; sesame.

MeSH terms

  • Genomics
  • Sesamum* / genetics