Determination of the evolutionary pressure on Camellia oleifera on Hainan Island using the complete chloroplast genome sequence

PeerJ. 2019 Jun 26;7:e7210. doi: 10.7717/peerj.7210. eCollection 2019.

Abstract

Camellia oleifera is one of the world's four major woody edible oil plants and has high ecological and medicinal value. Because of frequent interspecific hybridization, its genetics and evolutionary history have been difficult to study. This study used C. oleifera collected on Hainan Island. The unique island environment makes the quality of its tea oil higher than that of other species grown on the mainland. Moreover, long-term geographic isolation might affect gene structure. To better understand the molecular biology of this species, protect excellent germplasm resources, and promote population genetics and phylogenetic studies of Camellia plants, high-throughput sequencing technology was used to obtain the chloroplast genome sequence of Hainan C. oleifera. The whole chloroplast genome of Hainan C. oleifera was 156,995 bp in length, with a typical quadripartite structure comprising a large single copy (LSC) region of 86,648 bp, a small single copy (SSC) region of 18,297 bp, and a pair of inverted repeats (IRs) of 26,025 bp each. The genome encoded a total of 141 genes (115 different genes), including 88 protein-coding genes, 45 tRNA genes, and eight rRNA genes. Among these genes, nine contained one intron and two contained two introns; four overlapping genes were also detected. The total GC content of the Hainan C. oleifera chloroplast genome was 37.29%. The structural characteristics of the Hainan C. oleifera chloroplast genome were compared with those of mainland C. oleifera and eight other closely related Theaceae species; contractions and expansions of the IR/LSC and IR/SSC regions were found to affect chloroplast genome length. The chloroplast genome sequences of these Theaceae species were highly similar. A comparative analysis indicated that the Theaceae species were conserved in structure and evolution.
A total of 51 simple sequence repeat (SSR) loci were detected in the chloroplast genome of Hainan C. oleifera. None of the Camellia plants had pentanucleotide repeats, a feature that could serve as a good marker in phylogenetic studies. We also detected seven long repeats. The base composition of all repeats was biased toward A/T, which was consistent with the codon bias. Codon usage and phylogenetic analysis indicated that Hainan C. oleifera was evolutionarily close to C. crapnelliana. This study provides an effective genomic resource for studying the evolutionary history of the Theaceae family.
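
As a minimal sketch, the reported figures can be checked for internal consistency, and the idea of an SSR scan can be illustrated on a toy sequence. The region lengths and gene counts below are taken from the abstract. The function names, the repeat thresholds in MIN_REPEATS, and the regex-based scan are assumptions made for illustration only; the abstract does not state which SSR tool or thresholds were used.

```python
import re

# Region lengths (bp) and gene counts as reported in the abstract.
LSC, SSC, IR, TOTAL = 86_648, 18_297, 26_025, 156_995
PROTEIN_CODING, TRNA, RRNA, TOTAL_GENES = 88, 45, 8, 141


def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Hypothetical minimum repeat counts per motif length (1 = mono, ..., 6 = hexa).
# These are NOT taken from the paper.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. "AA" reported as a dinucleotide
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    print(quadripartite_length(LSC, SSC, IR) == TOTAL)
    print(PROTEIN_CODING + TRNA + RRNA == TOTAL_GENES)
    toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "GG"  # toy sequence, not real data
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Under these assumptions, the two IR copies plus the LSC and SSC regions sum to the reported genome length, and the three gene classes sum to the reported 141 genes. Grouping the hits from find_ssrs by motif length would show whether a class such as pentanucleotide repeats is absent from a given sequence.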

Keywords: Camellia oleifera; Chloroplast genome; Codon usage; Evolution pressure; Island plant; Repeat analysis; SSR.

Grants and funding

This research was supported by the Science and Technology Major Project of Hunan Province (2017NK1014), the Key Technology R&D Program of Hunan Province (2016TP2007, 2016NK2148), and the National Key Technology Research and Development Program of China (2014BAC09B03-02). No additional external funding was received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.