Genome assembly of Erythrophleum fordii, a special "ironwood" tree in China

BMC Genom Data. 2023 Nov 28;24(1):73. doi: 10.1186/s12863-023-01176-9.

Abstract

Objectives: Erythrophleum is a genus in the family Fabaceae. It contains only about 10 species and is known worldwide for its hardwood and medicinal properties. Erythrophleum fordii Oliv. is the only species of the genus distributed in China. Its superior wood and its use in folk medicine have led to its overexploitation in the wild. To support its effective conservation and to elucidate the distinctive genetic traits underlying wood formation and medicinal components, we present its first genome assembly.

Data description: This work generated ~160.8 Gb of raw Nanopore whole-genome sequencing (WGS) long reads, ~126.0 Gb of raw MGI WGS short reads, and ~29.0 Gb of raw RNA-seq reads from E. fordii leaf tissues. The de novo assembly of the E. fordii genome comprises 864,825,911 bp in 59 contigs, with a contig N50 of 30,830,834 bp. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicated 98.7% completeness of the assembly. The assembly contains 471,006,885 bp (54.4%) of repetitive sequences and 28,761 genes encoding 33,803 proteins. The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.
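
For readers unfamiliar with the summary statistics quoted above, the following is a minimal sketch of how a contig N50 and a nominal raw sequencing depth are commonly computed. It is not the authors' pipeline: the function names `contig_n50` and `fold_depth` and the toy contig lengths are hypothetical, "Gb" is assumed to mean 10^9 bases, and the depth values are based on raw (unfiltered) read totals divided by the assembly size.

```python
def contig_n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    if not lengths:
        raise ValueError("no contig lengths given")
    half_total = sum(lengths) / 2
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def fold_depth(raw_bases, assembly_size):
    """Nominal depth: total sequenced bases divided by assembly size."""
    return raw_bases / assembly_size


# Toy contig lengths (hypothetical, not the E. fordii contigs).
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = contig_n50(toy_lengths)

# Figures quoted in the abstract; Gb is assumed to be 10**9 bases.
ASSEMBLY_BP = 864_825_911
nanopore_depth = fold_depth(160.8e9, ASSEMBLY_BP)
mgi_depth = fold_depth(126.0e9, ASSEMBLY_BP)

print(f"toy N50: {toy_n50}")
print(f"nominal Nanopore depth: {nanopore_depth:.1f}x")
print(f"nominal MGI depth: {mgi_depth:.1f}x")
```

Under these assumptions, the raw read totals correspond to a nominal depth of roughly 186x for the Nanopore data and roughly 146x for the MGI data; the depth after read filtering would be lower.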

Keywords: De novo assembly; Gene annotation; Genome feature; Genome survey; RNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • China
  • Fabaceae*
  • Genome
  • Molecular Sequence Annotation
  • Trees*