The sequence and de novo assembly of the wild yak genome

Sci Data. 2020 Feb 24;7(1):66. doi: 10.1038/s41597-020-0400-3.

Abstract

Vulnerable populations of wild yak (Bos mutus), the wild ancestral species of domestic yak, survive in extremely cold, harsh and oxygen-poor regions of the Qinghai-Tibetan Plateau (QTP) and adjacent high-altitude regions. In this study, we sequenced and assembled the wild yak genome de novo. In total, six different insert-size libraries were sequenced, and 662 Gb of clean data were generated. The assembled wild yak genome is 2.83 Gb in length, with a contig N50 of 63.2 kb and a scaffold N50 of 16.3 Mb. BUSCO assessment indicated that 93.8% of the highly conserved mammalian genes were completely present in the genome assembly. Annotation of the wild yak genome assembly identified 1.41 Gb (49.65%) of repetitive sequences and a total of 22,910 protein-coding genes, including 20,660 (90.18%) annotated with functional terms. This first construction of the wild yak genome provides a valuable genetic resource that will facilitate further study of the genetic diversity of bovine species and accelerate yak breeding efforts.
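The N50 statistic quoted in the abstract can be illustrated with a minimal sketch. It uses the common definition: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration; they are not taken from the wild yak assembly, and the abstract does not state which tool the authors used to compute these values.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to
    shortest and return the length at which the running sum first
    reaches at least half of the total length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare in integers to avoid floating-point rounding at the halfway point.
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (arbitrary units), not data from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to the contig lengths of an assembly, the same calculation gives the contig N50; applied to scaffold lengths, it gives the scaffold N50.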

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Animals, Wild / genetics
  • Cattle / genetics*
  • Contig Mapping
  • Gene Library
  • Genome*
  • Sequence Analysis, DNA