A chromosome-level genome assembly of the Chinese cork oak (Quercus variabilis)

Front Plant Sci. 2022 Sep 23:13:1001583. doi: 10.3389/fpls.2022.1001583. eCollection 2022.

Abstract

Quercus variabilis (Fagaceae) is an ecologically and economically important deciduous broadleaved tree species that is native to, and widespread in, East Asia. It is a valuable woody species, an indicator of local forest health, and a dominant component of forest ecosystems in East Asia. However, genomic resources for Q. variabilis are still lacking. Here, we present a high-quality Q. variabilis genome generated by PacBio HiFi and Hi-C sequencing. The assembled genome size is 787 Mb, with a contig N50 of 26.04 Mb and a scaffold N50 of 64.86 Mb, comprising 12 pseudo-chromosomes. Repetitive sequences constitute 67.6% of the genome; the majority are long terminal repeats, which account for 46.62% of the genome. We used ab initio, RNA sequencing-based and homology-based predictions to identify protein-coding genes. A total of 32,466 protein-coding genes were identified, of which 95.11% could be functionally annotated. Evolutionary analysis showed that Q. variabilis is more closely related to Q. suber than to Q. lobata or Q. robur. We found no evidence for species-specific whole-genome duplications in Quercus after the species had diverged. This study provides the first genome assembly and the first gene annotation data for Q. variabilis. These resources will inform the design of further breeding strategies and will be valuable for studies of genome editing and comparative genomics in oak species.

Keywords: Hi-C sequencing; PacBio HiFi sequencing; Quercus variabilis; comparative genomics; genome assembly.
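
The contig and scaffold N50 values reported in the abstract summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are hypothetical and are not taken from the study, which does not state how its statistics were computed.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical name, not from the study).
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    # Walk the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length


# Toy contig lengths in arbitrary units (made up for illustration).
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to contig lengths, the same calculation gives a contig N50; applied to scaffold lengths after Hi-C anchoring, it gives a scaffold N50.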