De novo assembly of the cattle reference genome with single-molecule sequencing

GigaScience. 2020 Mar 1;9(3):giaa021. doi: 10.1093/gigascience/giaa021.

Abstract

Background: Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10–12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies.

Results: We present the new reference genome for cattle, ARS-UCD1.2, based on the same animal as the original to facilitate transfer and interpretation of results obtained from the earlier version, but applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly includes 2.7 Gb and is >250× more continuous than the original assembly, with contig N50 >25 Mb and L50 of 32. We also greatly expanded supporting RNA-based data for annotation that identifies 30,396 total genes (21,039 protein coding). The new reference assembly is accessible in annotated form for public use.
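The continuity figures quoted above (contig N50 and L50) follow the conventional definitions: with contigs sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the total assembled length, and L50 is the number of contigs needed to get there. The sketch below only illustrates those definitions. The function name `n50_l50` and the toy contig lengths are hypothetical, and nothing here is taken from the ARS-UCD1.2 assembly pipeline.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths.

    Hypothetical helper for illustration only; not part of any
    assembler or of the ARS-UCD1.2 workflow.
    """
    # Sort contigs from longest to shortest.
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")

    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first contig that brings the running total to at least
        # half of the assembly defines N50; its rank is L50.
        if running >= half_total:
            return length, count

    # Unreachable for non-empty input, kept for completeness.
    raise RuntimeError("failed to reach half of the total length")


# Toy data (made up): total length 300, half is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Read this way, an L50 of 32 with a contig N50 above 25 Mb means that the 32 longest contigs, each longer than 25 Mb, together account for at least half of the assembled sequence.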

Conclusions: We demonstrate that improved continuity of assembled sequence warrants the adoption of ARS-UCD1.2 as the new cattle reference genome and that increased assembly accuracy will benefit future research on this species.

Keywords: Hereford; bovine genome; cattle; reference assembly.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Breeding / methods
  • Breeding / standards*
  • Cattle / genetics*
  • Genome*
  • Genomics / methods
  • Genomics / standards*
  • Polymorphism, Genetic*
  • RNA-Seq / methods
  • RNA-Seq / standards
  • Reference Standards
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, DNA / standards