A new genomic resource dedicated to wood formation in Eucalyptus

BMC Plant Biol. 2009 Mar 27:9:36. doi: 10.1186/1471-2229-9-36.

Abstract

Background: Renowned for their fast growth, valuable wood properties and wide adaptability, Eucalyptus species are amongst the most planted hardwoods in the world, yet they are still at the early stages of domestication because conventional breeding is slow and costly. Thus, there is huge potential for marker-assisted breeding programs to improve traits such as wood properties. To this end, the sequencing, analysis and annotation of a large collection of expressed sequences tags (ESTs) from genes involved in wood formation in Eucalyptus would provide a valuable resource.

Results: We report here the normalization and sequencing of a cDNA library from developing Eucalyptus secondary xylem, as well as the construction and sequencing of two subtractive libraries (juvenile versus mature wood and vice versa). A total of 9,222 high quality sequences were collected from about 10,000 cDNA clones. The EST assembly generated a set of 3,857 wood-related unigenes including 2,461 contigs (Cg) and 1,396 singletons (Sg) that we named 'EUCAWOOD'. About 65% of the EUCAWOOD sequences produced matches with poplar, grapevine, Arabidopsis and rice protein sequence databases. BlastX searches of the Uniref100 protein database allowed us to allocate gene ontology (GO) and protein family terms to the EUCAWOOD unigenes. This annotation of the EUCAWOOD set revealed key functional categories involved in xylogenesis. For instance, 422 sequences matched various gene families involved in biosynthesis and assembly of primary and secondary cell walls. Interestingly, 141 sequences were annotated as transcription factors, some of them being orthologs of regulators known to be involved in xylogenesis. The EUCAWOOD dataset was also mined for genomic simple sequence repeat markers, yielding a total of 639 putative microsatellites. Finally, a publicly accessible database was created, supporting multiple queries on the EUCAWOOD dataset.

Conclusion: In this work, we have identified a large set of wood-related Eucalyptus unigenes called EUCAWOOD, thus creating a valuable resource for functional genomics studies of wood formation and molecular breeding in this economically important genus. This set of publicly available annotated sequences will be instrumental for candidate gene approaches, custom array development and marker-assisted selection programs aimed at improving and modulating wood properties.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Plant / genetics
  • Databases, Nucleic Acid*
  • Eucalyptus / genetics*
  • Eucalyptus / metabolism
  • Expressed Sequence Tags
  • Gene Expression Regulation, Plant
  • Gene Library
  • Genome, Plant*
  • Genomics
  • Sequence Analysis, DNA
  • Wood / genetics
  • Wood / metabolism*

Substances

  • DNA, Plant