Initial Analysis of Structural Variation Detections in Cattle Using Long-Read Sequencing Methods

Genes (Basel). 2022 May 6;13(5):828. doi: 10.3390/genes13050828.

Abstract

Structural variations (SVs), as a great source of genetic variation, are widely distributed in the genome. SVs involve longer genomic sequences and potentially have stronger effects than SNPs, but they are not well captured by short-read sequencing owing to their size and relevance to repeats. Improved characterization of SVs can provide more advanced insight into complex traits. With the availability of long-read sequencing, it has become feasible to uncover the full range of SVs. Here, we sequenced one cattle individual using 10× Genomics (10 × G) linked read, Pacific Biosciences (PacBio) continuous long reads (CLR) and circular consensus sequencing (CCS), as well as Oxford Nanopore Technologies (ONT) PromethION. We evaluated the ability of various methods for SV detection. We identified 21,164 SVs, which amount to 186 Mb covering 7.07% of the whole genome. The number of SVs inferred from long-read-based inferences was greater than that from short reads. The PacBio CLR identified the most of large SVs and covered the most genomes. SVs called with PacBio CCS and ONT data showed high uniformity. The one with the most overlap with the results obtained by short-read data was PB CCS. Together, we found that long reads outperformed short reads in terms of SV detections.

Keywords: cattle; long-read sequencing; structural variation.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Cattle / genetics
  • Genome
  • Genomics
  • High-Throughput Nucleotide Sequencing* / methods
  • Nanopores*
  • Sequence Analysis, DNA / methods

Grants and funding

This work was supported in part by USDA National Institute of Food and Agriculture (NIFA) Agriculture and Food Research Initiative (AFRI) grant numbers 2016-67015-24886, 2019-67015-29321, and 2021-67015-33409.