Whole-genome assembly and annotation of northern wild rice, Zizania palustris L., supports a whole-genome duplication in the Zizania genus

Plant J. 2021 Sep;107(6):1802-1818. doi: 10.1111/tpj.15419. Epub 2021 Aug 14.

Abstract

Zizania palustris L. (northern wild rice, NWR) is an aquatic grass native to North America that is notable for its nutritious grain. The species has ecological, cultural and agricultural significance, particularly in the Great Lakes region of the USA. Using flow cytometry, we first estimated the NWR genome size to be 1.8 Gb. Using long- and short-range sequencing, Hi-C scaffolding and RNA-seq data from eight tissues, we generated an annotated whole-genome de novo assembly of NWR. The assembly was 1.29 Gb in length, highly repetitive (approx. 76.0%) and contained 46 421 putative protein-coding genes. The expansion of retrotransposons within the genome and a whole-genome duplication (WGD) after the Zizania-Oryza speciation event have both led to an increase in the genome size of NWR in comparison with Oryza sativa L. and Zizania latifolia. Both events point to a genome that has changed rapidly over a short evolutionary period. Comparative analyses revealed the conservation of large syntenic blocks between NWR and O. sativa, which were used to identify putative seed-shattering genes. Estimates of divergence times revealed that the Zizania genus diverged from Oryza approximately 26-30 million years ago (MYA), whereas NWR and Z. latifolia diverged from one another approximately 6-8 MYA. Comparative genomics confirmed evidence of a WGD in the Zizania genus and provided support that the event occurred prior to the NWR-Z. latifolia speciation event. This genome assembly and annotation provide a valuable resource for comparative genomics in the Oryzeae tribe and for future conservation and breeding efforts of NWR.
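
The headline figures in the abstract can be related to one another with simple arithmetic. The following is a minimal sketch, not part of the published analysis: the constants are the values reported above, while the function names and the derived quantities (assembled fraction, repeat span, gene density) are illustrative assumptions introduced here.

```python
# Values as reported in the abstract; everything derived from them below
# is illustrative arithmetic, not a result stated by the authors.
ESTIMATED_GENOME_GB = 1.8   # flow-cytometry genome size estimate
ASSEMBLY_GB = 1.29          # length of the de novo assembly
REPEAT_FRACTION = 0.76      # approximate repetitive content of the assembly
GENE_COUNT = 46421          # putative protein-coding genes


def assembled_fraction(assembly_gb: float, estimated_gb: float) -> float:
    """Share of the estimated genome size represented in the assembly."""
    return assembly_gb / estimated_gb


def repeat_span_gb(assembly_gb: float, repeat_fraction: float) -> float:
    """Approximate length of repetitive sequence in the assembly, in Gb."""
    return assembly_gb * repeat_fraction


def genes_per_mb(gene_count: int, assembly_gb: float) -> float:
    """Average number of annotated genes per megabase of assembly."""
    return gene_count / (assembly_gb * 1000)


if __name__ == "__main__":
    frac = assembled_fraction(ASSEMBLY_GB, ESTIMATED_GENOME_GB)
    repeats = repeat_span_gb(ASSEMBLY_GB, REPEAT_FRACTION)
    density = genes_per_mb(GENE_COUNT, ASSEMBLY_GB)
    print(f"Assembled fraction of estimated genome: {frac:.1%}")
    print(f"Repetitive sequence in assembly: {repeats:.2f} Gb")
    print(f"Gene density: {density:.1f} genes per Mb")
```

Under these assumptions, the assembly covers roughly 72% of the flow-cytometry estimate, about 0.98 Gb of it is repetitive, and gene density averages about 36 genes per Mb.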

Keywords: Zizania palustris; PacBio sequencing; RNA-seq; annotation; de novo assembly; northern wild rice; whole-genome duplication.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Evolution, Molecular
  • Flow Cytometry
  • Gene Duplication
  • Genome Size
  • Genome, Plant*
  • Genomics
  • Minnesota
  • Molecular Sequence Annotation
  • Oryza / genetics*
  • Phylogeny
  • Plant Breeding
  • Poaceae / genetics*
  • Repetitive Sequences, Nucleic Acid
  • Transcriptome

Associated data

  • RefSeq/PRJNA574141
  • RefSeq/PRJNA600525
  • RefSeq/JAAALK000000000
  • RefSeq/JAAALK010000000