Genome assembly and microsatellite marker development using Illumina and PacBio sequencing in Persicaria maackiana (Polygonaceae) from Korea

Genes Genomics. 2024 Feb;46(2):187-202. doi: 10.1007/s13258-023-01479-2. Epub 2024 Jan 19.

Abstract

Background: Persicaria maackiana (Regel) is a potential medicinal plant that exerts anti-diabetic effects. However, the lack of genomic information on P. maackiana hinders research at the molecular level.

Objective: We aimed to construct a draft genome assembly and obtain comprehensive genomic information on P. maackiana using the high-throughput sequencing platforms PacBio Sequel II and Illumina.

Methods: Persicaria maackiana samples from three natural populations at the Gaecheon, Gichi, and Uiryeong reservoirs in South Korea were used to generate genomic DNA libraries, perform de novo genome assembly, gene ontology analysis, phylogenetic tree analysis, and genotyping, and identify microsatellite markers.
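
The abstract does not state which tool or thresholds were used to identify microsatellite loci. As an illustration only, the following minimal sketch scans a sequence for perfect tandem repeats with a regular expression. The function name, the minimum repeat counts, and the example sequence are all hypothetical and are not taken from the study.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
# These values are an assumption for illustration, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_microsatellites(sequence, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) for perfect tandem repeats.

    Coordinates are 0-based and end-exclusive. Only uninterrupted repeats
    of A/C/G/T motifs are reported.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example ATAT, which is already covered by AT).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            copies = (match.end() - match.start()) // motif_len
            hits.append((match.start(), match.end(), motif, copies))
    return sorted(hits)


# Hypothetical example sequence containing an (AT)8 and a (GAA)5 repeat.
example = "GGCC" + "AT" * 8 + "CCGTT" + "GAA" * 5 + "TTC"
loci = find_microsatellites(example)
for start, end, motif, copies in loci:
    print(f"{motif} x{copies} at {start}-{end}")
```

In practice, primers would then be designed in the regions flanking each reported locus, a step this sketch does not cover.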

Results: The P. maackiana genome assembly comprised 32,179 contigs. Assessment of assembly completeness identified 1,503 (93.12%) complete Benchmarking Universal Single-Copy Orthologs (BUSCOs). A total of 64,712 protein-coding genes were predicted and successfully annotated against the protein database. Based on Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology, 13,778 genes were assigned to 18 categories. Genes that activate AMPK were identified in the KEGG pathways. A total of 316,992 microsatellite loci were identified, and primers targeting the flanking regions were developed for 292,059 of these loci. Of these, 150 primer sets were randomly selected for amplification, of which 30 were polymorphic. The polymorphic primer sets amplified 3-9 alleles per locus. The mean observed and expected heterozygosity were 0.189 and 0.593, respectively. Polymorphism information content values of the markers ranged from 0.361 to 0.754.
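
Observed heterozygosity, expected heterozygosity, and polymorphism information content (PIC) are standard per-locus summaries of marker variability. The abstract does not say which software or estimator was used, so the following is only a minimal sketch of the textbook definitions: observed heterozygosity as the fraction of heterozygous individuals, expected heterozygosity as 1 minus the sum of squared allele frequencies (without a small-sample correction), and PIC following Botstein et al. (1980). The genotypes are hypothetical allele sizes for one locus, not data from the study.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of diploid genotypes (a, b)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """1 - sum(p_i^2), with no correction for sample size."""
    return 1.0 - sum(p * p for p in freqs.values())


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - homozygosity - cross


# Hypothetical genotypes (allele sizes in bp) for eight individuals.
genotypes = [
    (150, 150), (150, 154), (154, 154), (158, 158),
    (150, 150), (154, 158), (158, 158), (150, 150),
]

freqs = allele_frequencies(genotypes)
ho = observed_heterozygosity(genotypes)
he = expected_heterozygosity(freqs)
pic = polymorphism_information_content(freqs)
print(f"Ho={ho:.3f} He={he:.3f} PIC={pic:.3f}")
```

An observed heterozygosity well below the expected value, as in this toy locus and in the means reported above, indicates fewer heterozygotes than random mating would predict.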

Conclusion: Collectively, our study provides a valuable resource for future comparative genomics, phylogeny, and population studies of P. maackiana.

Keywords: Genome assembly; Genomics; Medicinal plants; Next-generation sequencing; Phylogeny; Polymorphism information content.

MeSH terms

  • Genomics
  • Microsatellite Repeats / genetics
  • Molecular Sequence Annotation
  • Phylogeny
  • Polygonaceae* / genetics