Use of overlapping DNA pools to discern genetic differences despite pooling error

J Anim Sci. 2023 Jan 3;101:skad166. doi: 10.1093/jas/skad166.

Abstract

Genotyping pools of commercial cattle and individual seedstock animals may reveal hidden relationships between sectors, enabling use of commercial data for genetic evaluation. However, commercial data capture may be compromised by inexact pool formation. We aimed to estimate the concordance of distances or genomic covariances among pooling allele frequencies (PAFs) with the planned overlap of DNA pools, each composed of 100 animals and sharing 0% or 50% of its animals with another pool. Cattle lung samples were collected from a commercial beef processing plant on a single day. Six pools of 100 animals each were constructed so that overlap between pools was 0% or 50%. Two pools of all 200 animals were constructed to estimate PAFs for all 200 animals. Frozen lung tissue (0.01 g) from each animal was weighed into the tube containing each pool. Every contribution of an individual animal was measured independently to ensure independence of pooling errors. Lung samples were kept on dry ice during the pooling process to keep them from thawing. The eight pools were then assayed for approximately 100,000 single nucleotide polymorphisms (SNPs). The PAF for each SNP and pool was based on the relative intensity of the two dyes used to detect the alleles rather than on genotype calls, which are not tractable from pooling data. Euclidean distances and genomic relationships among the PAFs for the eight pools were estimated, and distances were tested for concordance with pool overlap using permutation-based analysis of distance. Distances among pools were concordant with the planned overlap of animals shared between pools (P = 0.0024); pool overlap accounted for 70% of the variation and pooling error accounted for 30%. Pools containing 100 animals with no overlap were the most distant from one another and pools with 50% overlap were the least distant.
This work shows that we can discern differences in distance between pairs of overlapping DNA pools sharing 0% and 50% of the animals. Genomic correlations were near zero for most pairs of nonoverlapping pools, indicating that these pairs did not share many related animals. One pair of nonoverlapping pools, however, likely contained related animals between pools because its correlation was 0.21. Genomic relationships between pools sharing 50% overlap ranged from 0.21 to 0.39 (N = 12).
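The workflow described above (PAFs from dye intensities, Euclidean distances among pools, and a permutation test against planned overlap) can be sketched on simulated data. This is a minimal illustration, not the authors' pipeline. Several pieces are assumptions: the genotypes and noise levels are simulated, the PAF is taken as the simple ratio x / (x + y) of the two dye signals, the six pools are laid out as pairs of four 50-animal groups (one layout consistent with the pool counts in the abstract), and a Mantel-style permutation test is swapped in as a simple stand-in for the permutation-based analysis of distance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_snp = 2000                                        # far fewer than the ~100,000 assayed
p = rng.uniform(0.1, 0.9, n_snp)                    # hypothetical base allele frequencies
genotypes = rng.binomial(2, p, size=(200, n_snp))   # 200 simulated animals, coded 0/1/2

# Assumed layout: four groups of 50 animals (A, B, C, D); each pool joins two groups.
groups = np.arange(200).reshape(4, 50)
pool_groups = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]


def paf_from_intensity(x, y):
    """Pooling allele frequency as the share of total dye signal from one allele."""
    return x / (x + y)


def euclidean_distances(paf):
    """Pairwise Euclidean distances between rows (pools) of a pools x SNP array."""
    diff = paf[:, None, :] - paf[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def mantel_permutation(dist, design, n_perm=999):
    """One-sided Mantel-style test: do distances fall as planned overlap rises?"""
    iu = np.triu_indices_from(dist, k=1)
    observed = np.corrcoef(dist[iu], design[iu])[0, 1]
    hits = 1                                        # the observed labeling counts once
    for _ in range(n_perm):
        perm = rng.permutation(dist.shape[0])       # relabel the pools
        permuted = dist[np.ix_(perm, perm)]
        if np.corrcoef(permuted[iu], design[iu])[0, 1] <= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Simulate dye intensities with independent pooling error for every pool.
pafs = []
for g1, g2 in pool_groups:
    members = np.concatenate([groups[g1], groups[g2]])
    freq = genotypes[members].mean(axis=0) / 2      # true pool allele frequency
    x = np.clip(freq + rng.normal(0, 0.01, n_snp), 1e-6, None)
    y = np.clip(1 - freq + rng.normal(0, 0.01, n_snp), 1e-6, None)
    pafs.append(paf_from_intensity(x, y))
pafs = np.array(pafs)

# Planned overlap between pools: fraction of animals shared (0, 0.5, or 1 on the diagonal).
overlap = np.array([[len(set(a) & set(b)) / 2 for b in pool_groups]
                    for a in pool_groups])

dist = euclidean_distances(pafs)
r_obs, p_value = mantel_permutation(dist, overlap)
iu = np.triu_indices(6, k=1)
print("mean distance, 0% overlap: ", dist[iu][overlap[iu] == 0].mean())
print("mean distance, 50% overlap:", dist[iu][overlap[iu] == 0.5].mean())
print("Mantel r:", r_obs, "permutation p:", p_value)
```

Under this layout, 3 of the 15 pool pairs share no animals and 12 share half of them. With only six pools the permutation p-value cannot become very small, because many relabelings reproduce the same design; the two 200-animal pools are left out of this sketch.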

Keywords: DNA pooling; beef cattle; genetic markers; genetic relationship; genotyping array.

Plain language summary

Genetic evaluation of seedstock cattle could benefit from commercial data. There are hidden relationships between commercial and seedstock sectors because many commercial producers buy bulls from the seedstock sector. Relationships are hidden because pedigree is not tracked in commercial populations. Single nucleotide polymorphism genotypes could reveal these hidden relationships; however, genotyping can be cost-prohibitive. The cost of commercial data capture could be decreased by pooling DNA, a method that genotypes groups of animals so that their data can be used in genetic evaluation; however, error from inexact pool formation can complicate interpretation. Results from pools of overlapping random unrelated animals mimic the results from pools sharing relatives with the same degree of shared genomes. For example, a pool of progeny and a pool of the dams of the pooled progeny would produce the same result as two pools sharing 50% overlap of random unrelated animals. We can estimate the relatedness between unknown pools even in the presence of pooling error if an unknown pool comparison is similar to an overlapping pool comparison. Knowing the relationship between seedstock cattle and pools of commercial cattle may allow commercial data to enhance genetic evaluation of seedstock animals.
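The equivalence between a dam/progeny pool pair and a pair of pools with 50% overlap can be checked with a small simulation. The sketch below rests on simplifying assumptions that are not taken from the paper: unlinked SNPs, random mating, unrelated sires, no pooling error, and known base allele frequencies used for centering. Under those assumptions both comparisons are expected to give a correlation near 0.5 between centered pool frequencies.

```python
import numpy as np

rng = np.random.default_rng(1)
n_snp, n = 5000, 100
p = rng.uniform(0.1, 0.9, n_snp)          # hypothetical base allele frequencies


def gametes(size):
    """Draw one random allele (0 or 1) per animal and SNP."""
    return rng.binomial(1, p, size=(size, n_snp))


def pool_freq(genotypes):
    """Error-free allele frequency of a pool of diploid genotypes coded 0/1/2."""
    return genotypes.mean(axis=0) / 2


def centered_corr(f1, f2):
    """Correlation across SNPs of pool frequencies centered on base frequencies."""
    return np.corrcoef(f1 - p, f2 - p)[0, 1]


# Case 1: two pools of 100 unrelated animals that share 50 animals.
unrelated = gametes(150) + gametes(150)
r_overlap = centered_corr(pool_freq(unrelated[:100]), pool_freq(unrelated[50:]))

# Case 2: a pool of 100 dams and a pool of their 100 progeny (one calf per dam).
dam_a, dam_b = gametes(n), gametes(n)                     # the two alleles of each dam
pick = rng.integers(0, 2, size=(n, n_snp)).astype(bool)   # which dam allele is passed on
progeny = np.where(pick, dam_a, dam_b) + gametes(n)       # plus one allele from a sire
r_dam = centered_corr(pool_freq(dam_a + dam_b), pool_freq(progeny))

print("50% overlap pools:", r_overlap)
print("dam vs progeny pools:", r_dam)
```

Adding pooling error to either case would pull both correlations below 0.5 by inflating the variance of each pool's frequencies.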

MeSH terms

  • Alleles
  • Animals
  • Cattle / genetics
  • DNA* / genetics
  • Gene Frequency
  • Genomics*
  • Genotype
  • Polymorphism, Single Nucleotide

Substances

  • DNA