A rapid and reference-free imputation method for low-cost genotyping platforms

Sci Rep. 2023 Dec 27;13(1):23083. doi: 10.1038/s41598-023-50086-4.

Abstract

Most current genotype imputation methods are reference-based, which poses several challenges to users, such as high computational costs and reference panel inaccessibility. Deep learning models are therefore expected to enable reference-free imputation methods that achieve higher accuracy with shorter running times. We propose an imputation method, GRUD, that uses recurrent neural networks integrated with an additional discriminator network. The method was applied to datasets from genotyping chips and low-pass whole-genome sequencing (LP-WGS), with reference panels from the 1000 Genomes Project (1KGP) phase 3, the dataset of 4810 Singaporeans (SG10K), and the 1000 Vietnamese Genome Project (VN1K). Our model was more accurate than existing methods on multiple datasets, especially for common variants with high minor allele frequency, and it reduced running time and memory usage. In summary, these results indicate that GRUD can be used in genomic analyses to improve the accuracy and running time of genotype imputation.

MeSH terms

  • Gene Frequency
  • Genome*
  • Genome-Wide Association Study / methods
  • Genotype
  • Genotyping Techniques / methods
  • Humans
  • Polymorphism, Single Nucleotide*

Supplementary concepts

  • Singaporean people