Locating a novel autosomal recessive genetic variant in the cattle glucokinase gene using only WGS data from three cases and six carriers

Front Genet. 2022 Aug 29:13:755693. doi: 10.3389/fgene.2022.755693. eCollection 2022.

Abstract

New Mendelian genetic conditions that adversely affect livestock arise all the time. Managing them effectively requires methods that are quick and accurate. Until recently, finding the causal genomic site of a new autosomal recessive genetic disease has required a two-stage approach: single-nucleotide polymorphism (SNP) chip genotyping is used to locate the region containing the new variant, and this region is then explored using fine-mapping methods to locate the actual site of the variant. This study explores bioinformatic methods that can identify the causative variants of fully penetrant recessive genetic disorders from just nine whole genome-sequenced animals, simplifying and expediting the process to a one-step procedure. Using whole genome sequencing of only three cases and six carriers, the site of a novel variant causing perinatal mortality in Irish Moiled calves was located. Four methods were used to interrogate the variant call format (VCF) data file of these nine animals: genotype criteria (GCR), autozygosity-by-difference (ABD), variant prediction scoring, and registered SNP information. From more than nine million variants in the VCF file, only one site was identified by all four methods (Chr4: g.77173487A>T; ARS-UCD1.2, GCF_002263795.1). This site was a splice acceptor variant located in the glucokinase gene (GCK). The site was verified on an independent sample of animals from the breed using genotyping by polymerase chain reaction at the candidate site and autozygosity-by-difference using SNP chips. Both methods confirmed the candidate site. Investigation of the GCR method found that sites meeting the GCR were not evenly spread across the genome but were concentrated in regions of long runs of homozygosity. Locating GCR sites was best performed using two carriers to every case, and the carriers should be distantly related to the cases, within the breed concerned.
Fewer than 20 animals need to be sequenced when the GCR and ABD methods are used together. The genomic site of a novel autosomal recessive Mendelian genetic disease can be located using fewer than 20 animals combined with two bioinformatic methods, autozygosity-by-difference and genotype criteria. In many instances, it may also be confirmed with variant prediction scoring. This should speed up the management of new genetic diseases and simplify it to a single-step process.
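
To illustrate the idea behind the genotype criteria (GCR) method, the following is a minimal Python sketch, not the authors' implementation. The abstract does not spell out the exact criteria, so the sketch assumes the usual expectation for a fully penetrant autosomal recessive variant: every case is homozygous for the same non-reference allele, and every carrier has exactly one copy of that allele. The function names (`parse_gt`, `meets_gcr`, `gcr_sites`), the sample names, and the toy VCF records are all hypothetical and are not taken from the study's data.

```python
def parse_gt(field):
    """Return the two allele indices from a VCF sample field, or None if unusable."""
    gt = field.split(":")[0].replace("|", "/")  # GT is the first sub-field
    alleles = gt.split("/")
    if len(alleles) != 2 or "." in alleles:     # skip missing or non-diploid calls
        return None
    return tuple(int(a) for a in alleles)


def meets_gcr(sample_fields, case_idx, carrier_idx):
    """Assumed criteria: cases homozygous for one non-reference allele,
    carriers carrying exactly one copy of that allele."""
    gts = [parse_gt(f) for f in sample_fields]
    if any(gts[i] is None for i in case_idx + carrier_idx):
        return False
    first = gts[case_idx[0]]
    if first[0] != first[1] or first[0] == 0:   # first case must be hom. non-ref
        return False
    alt = first[0]
    if any(gts[i] != (alt, alt) for i in case_idx):
        return False
    return all(gts[i].count(alt) == 1 for i in carrier_idx)


def gcr_sites(vcf_lines, cases, carriers):
    """Return (chrom, pos, ref, alt) for every record that meets the criteria."""
    sites = []
    case_idx = carrier_idx = None
    for line in vcf_lines:
        if line.startswith("##") or not line.strip():
            continue                             # skip meta lines and blanks
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]                   # sample columns start at index 9
            case_idx = [samples.index(s) for s in cases]
            carrier_idx = [samples.index(s) for s in carriers]
            continue
        if case_idx is None:
            raise ValueError("VCF header line (#CHROM ...) not found")
        if meets_gcr(cols[9:], case_idx, carrier_idx):
            sites.append((cols[0], int(cols[1]), cols[3], cols[4]))
    return sites


# Toy records (invented for illustration): one case and two carriers.
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
          "case1", "car1", "car2"]
TOY_VCF = [
    "##fileformat=VCFv4.2",
    "\t".join(HEADER),
    "\t".join(["1", "100", ".", "G", "C", ".", "PASS", ".", "GT", "1/1", "1/1", "0/1"]),
    "\t".join(["1", "200", ".", "A", "T", ".", "PASS", ".", "GT", "1/1", "0/1", "1|0"]),
    "\t".join(["1", "300", ".", "C", "G", ".", "PASS", ".", "GT", "0/1", "0/1", "0/1"]),
    "\t".join(["1", "400", ".", "T", "A", ".", "PASS", ".", "GT", "1/1", "./.", "0/1"]),
]

print(gcr_sites(TOY_VCF, ["case1"], ["car1", "car2"]))
```

In this toy input only the record at position 200 satisfies the assumed criteria: position 100 has a homozygous carrier, position 300 has a heterozygous case, and position 400 has a missing carrier genotype. A real analysis would normally read the VCF with a dedicated parser and apply quality filters first; this sketch only shows the genotype-pattern test itself.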

Keywords: Irish Moiled; WGS; cattle; glucokinase gene; perinatal mortality; recessive genetics; runs of homozygosity.