Chloroplast genome assembly of Serjania erecta Raldk: comparative analysis reveals gene number variation and selection in protein-coding plastid genes of Sapindaceae

Front Plant Sci. 2023 Sep 26:14:1258794. doi: 10.3389/fpls.2023.1258794. eCollection 2023.

Abstract

Serjania erecta Raldk is an essential genetic resource due to its anti-inflammatory, gastroprotective, and anti-Alzheimer properties. However, the genetic and evolutionary aspects of the species remain poorly known. Here, we sequenced and assembled the complete chloroplast genome of S. erecta and used it in a comparative analysis within the Sapindaceae family. S. erecta has a chloroplast genome (cpDNA) of 159,297 bp, divided into a Large Single Copy region (LSC) of 84,556 bp and a Small Single Copy region (SSC) of 18,057 bp, which are separated by two Inverted Repeat regions (IRa and IRb) of 28,342 bp each. Among the 12 species used in the comparative analysis, S. erecta has the fewest long and microsatellite repeats. The genome structure of Sapindaceae species is relatively conserved; the number of genes varies from 128 to 132, and this variation is associated with three main factors: (1) Expansion and contraction events in the size of the IRs, resulting in variations in the number of rpl22, rps19, and rps3 genes; (2) Pseudogenization of the rps2 gene; and (3) Loss or duplication of genes encoding tRNAs, associated with the duplication of trnH-GUG in X. sorbifolium and the absence of trnT-CGU in the Dodonaeoideae subfamily. We identified 10 and 11 mutational hotspots for Sapindaceae and Sapindoideae, respectively. Six highly diverse regions (tRNA-Lys - rps16, ndhC - tRNA-Val, petA - psbJ, ndhF, rpl32 - ccsA, and ycf1) are found in both groups and show potential for the development of DNA barcode markers for molecular taxonomic identification of Serjania. We found that the psaI gene evolves under neutrality in Sapindaceae, while all other chloroplast genes are under strong negative selection. However, local positive selection exists in the ndhF, rpoC2, ycf1, and ycf2 genes. The genes ndhF and ycf1 combine high nucleotide diversity with local positive selection, demonstrating significant potential as markers.
Our study provides the first chloroplast genome of a member of the Paullinieae tribe. Furthermore, we identified patterns of variation in gene number and of selection in genes that are possibly associated with the family's evolutionary history.
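
The region sizes reported above can be checked against the total genome length with simple arithmetic, since a quadripartite plastome consists of one LSC, one SSC, and two IR copies. The following is a minimal sketch of that check, not part of the authors' pipeline; the function and variable names are hypothetical, and the only inputs are the lengths stated in the abstract.

```python
# Region lengths (bp) as reported in the abstract for S. erecta.
LSC = 84_556             # Large Single Copy region
SSC = 18_057             # Small Single Copy region
IR = 28_342              # length of ONE inverted repeat copy (IRa or IRb)
REPORTED_TOTAL = 159_297


def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Return the plastome length for one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_total(LSC, SSC, IR)
ir_fraction = 2 * IR / total  # share of the genome occupied by both IR copies

print(f"computed total: {total} bp")
print(f"matches reported total: {total == REPORTED_TOTAL}")
print(f"IR share of genome: {ir_fraction:.1%}")
```

The computed sum equals the reported 159,297 bp, which confirms that the 28,342 bp figure refers to each IR copy rather than to both combined.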

Keywords: cpDNA; molecular evolution; negative selection; organellar genome; plastome.

Grants and funding

The authors declare financial support was received for the research, authorship, and/or publication of this article. This work was developed in the context of Instituto Nacional de Ciência e Tecnologia em Ecologia, Evolução e Conservação da Biodiversidade (INCT – EECBio), supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq – process 465610/20145) and Fundação de Amparo à Pesquisa do Estado de Goiás (FAPEG – process 201810267000023). We are also thankful for the support from PPGS Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)/FAPEG (#08/2014) and CNPq (MCTIC/CNPq #28/2018, 435477/2018-8). RN was supported by a PDCTR fellowship from FAPEG/CNPq (#202110267000863). RB-F was supported by a DTI fellowship from CNPq. A PNPD scholarship from CAPES supported CT. A productivity grant from CNPq has continuously supported MT, CS-N, and JD-f.