Identification and phylogenetic analysis of five Crataegus species (Rosaceae) based on complete chloroplast genomes

Planta. 2021 Jun 28;254(1):14. doi: 10.1007/s00425-021-03667-4.

Abstract

The chloroplast genomes of the five Crataegus species were shown to have a conserved genome structure. Complete chloroplast genome sequences were more suitable than highly variable regions for the identification and phylogenetic analysis of Crataegus species.

Hawthorn, which is commonly used in traditional Chinese medicine, is one of the most popular sour fruits and has high economic value. Crataegus pinnatifida var. pinnatifida and C. pinnatifida var. major are frequently adulterated with other Crataegus species on the herbal medicine market, yet most Crataegus plants are difficult to identify using traditional morphological methods. Here, we compared five Crataegus chloroplast (CP) genomes: two newly sequenced (C. pinnatifida var. pinnatifida and C. pinnatifida var. major) and three previously published. The five CP genomes had a conserved genome structure, gene content and codon usage. Their total length ranged from 159,654 to 159,865 bp. Each genome contained 129-130 annotated genes, including 84-85 protein-coding genes, 37 tRNA genes and 8 rRNA genes. Bioinformatics analysis revealed 96-103 simple sequence repeats (SSRs) and 48-70 long repeats in the five CP genomes. By combining the mVISTA results with nucleotide diversity values, we selected five highly variable regions for species identification and relationship studies. Maximum likelihood trees were constructed on the basis of both the complete CP genome sequences and the highly variable regions. The trees based on complete CP genome sequences had higher discriminatory power for Crataegus species, indicating that the complete CP genome could be used as a super-barcode to accurately authenticate the five Crataegus species.

Keywords: Chloroplast genome; Crataegus pinnatifida; Phylogenetic; Species identification; Super-barcode.

MeSH terms

  • Crataegus*
  • Genome, Chloroplast*
  • Microsatellite Repeats
  • Phylogeny
  • Rosaceae*