Genome sequence of the agarwood tree Aquilaria sinensis (Lour.) Spreng: the first chromosome-level draft genome in the Thymelaeaceae family

Gigascience. 2020 Mar 1;9(3):giaa013. doi: 10.1093/gigascience/giaa013.

Abstract

Background: Aquilaria sinensis (Lour.) Spreng is one of the important plant resources involved in the production of agarwood in China. The agarwood resin collected from wounded Aquilaria trees has been used in Asia for aromatic and medicinal purposes since ancient times, yet the mechanism underlying agarwood formation remains poorly understood owing to a lack of accurate, high-quality genetic information.

Findings: We report the genomic architecture of A. sinensis, obtained with an integrated strategy combining Nanopore, Illumina, and Hi-C sequencing. The final genome assembly was ∼726.5 Mb in size and highly contiguous, with a contig N50 of 1.1 Mb. We combined Hi-C data with the genome assembly to generate chromosome-level scaffolds. Eight super-scaffolds corresponding to the 8 chromosomes were assembled from 1,862 contigs to a final size of 716.6 Mb, with a scaffold N50 of 88.78 Mb. BUSCO evaluation indicated that genome completeness reached 95.27%. Repeat sequences accounted for 59.13% of the genome, and 29,203 protein-coding genes were annotated. Phylogenetic analysis using single-copy orthologous genes showed that A. sinensis is closely related to Gossypium hirsutum and Theobroma cacao from the Malvales order, and that A. sinensis diverged from their common ancestor ∼53.18–84.37 million years ago.
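
For readers unfamiliar with the contiguity figures reported above, the following minimal Python sketch shows how an N50 value is conventionally computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the study. The last line simply divides the two assembly sizes stated in the abstract to estimate, under the assumption that both figures refer to the same assembly, what share of the assembly sits in the eight super-scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), NOT from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Sizes quoted in the abstract, in Mb; the ratio itself is our own
# derivation and is not stated in the source.
anchored_fraction = 716.6 / 726.5
print(f"toy N50 = {toy_n50}, anchored fraction = {anchored_fraction:.1%}")
```

The same function applies to contigs or scaffolds alike; only the input list of lengths changes.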

Conclusions: Here, we present the first chromosome-level genome assembly and gene annotation of A. sinensis. This study should provide valuable genetic resources for further research on the mechanism of agarwood formation, genome-assisted improvement, and conservation biology of Aquilaria species.

Keywords: Aquilaria sinensis; Hi-C sequencing; agarwood; annotation; chromosome-level genome assembly.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Plant / genetics*
  • Contig Mapping
  • Genome, Plant*
  • Molecular Sequence Annotation
  • Phylogeny
  • Plant Proteins / genetics
  • Thymelaeaceae / classification
  • Thymelaeaceae / genetics*
  • Whole Genome Sequencing

Substances

  • Plant Proteins