Pacific Biosciences assembly with Hi-C mapping generates an improved, chromosome-level goose genome

Gigascience. 2020 Oct 24;9(10):giaa114. doi: 10.1093/gigascience/giaa114.

Abstract

Background: The domestic goose is an economically important and scientifically valuable waterfowl; however, a lack of high-quality genomic data has hindered research on its genome, genetics, and breeding. Because domestic goose breeds derive from both the swan goose (Anser cygnoides) and the graylag goose (Anser anser), we selected a female Tianfu goose for genome sequencing. We generated a chromosome-level goose genome assembly using a hybrid de novo assembly approach that combined Pacific Biosciences single-molecule real-time sequencing, high-throughput chromatin conformation capture (Hi-C) mapping, and Illumina short-read sequencing.

Findings: We generated a 1.11-Gb goose genome with contig and scaffold N50 values of 1.85 and 33.12 Mb, respectively. The assembly contains 39 pseudo-chromosomes (2n = 78) accounting for ∼88.36% of the goose genome. Compared with previous goose assemblies, our assembly shows greater continuity, completeness, and accuracy; the annotation of core eukaryotic genes and universal single-copy orthologs has also been improved. We identified 17,568 protein-coding genes and a repeat content of 8.67% (96.57 Mb) in this genome assembly. We also explored the spatial organization of chromatin and gene expression in goose liver tissue, in terms of inter-pseudo-chromosomal interaction patterns, compartments, topologically associating domains, and promoter-enhancer interactions.
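
For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed from a list of contig or scaffold lengths: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. It is not the authors' pipeline. The function names `n50` and `implied_assembly_size_mb` are hypothetical, and the toy lengths are illustrative rather than taken from the goose assembly. The second helper only checks that the reported repeat content (8.67%, 96.57 Mb) is consistent with the stated assembly size.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Standard definition: sort lengths from longest to shortest and return
    the length at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


def implied_assembly_size_mb(repeat_mb, repeat_percent):
    """Assembly size (Mb) implied by a repeat span and its percentage."""
    return repeat_mb / (repeat_percent / 100.0)


if __name__ == "__main__":
    # Toy contig lengths (illustrative only, not from the paper).
    toy_lengths = [2, 3, 4, 5, 6]
    print("N50 of toy lengths:", n50(toy_lengths))

    # Figures from the abstract: 96.57 Mb of repeats at 8.67%.
    size_mb = implied_assembly_size_mb(96.57, 8.67)
    print(f"Implied assembly size: {size_mb:.2f} Mb")
```

On the abstract's own figures, the implied size is roughly 1,114 Mb, which agrees with the reported 1.11-Gb assembly.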

Conclusions: We present the first chromosome-level assembly of the goose genome. This will be a valuable resource for future genetic and genomic studies on geese.

Keywords: Hi-C; PacBio; annotation; chromosome-length assembly; goose genome; hybrid de novo assembly approaches.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromosomes / genetics
  • Female
  • Geese* / genetics
  • Genome*
  • Genomics
  • High-Throughput Nucleotide Sequencing
  • Molecular Sequence Annotation