Recovering High-Quality Host Genomes from Gut Metagenomic Data through Genotype Imputation

Adv Genet (Hoboken). 2022 May 6;3(3):2100065. doi: 10.1002/ggn2.202100065. eCollection 2022 Sep.

Abstract

Metagenomic datasets of host-associated microbial communities often contain host DNA, which is usually discarded because the amount of data is too low for accurate host genetic analyses. However, genotype imputation can be used to reconstruct host genotypes if a reference panel is available. Here, a two-step imputation strategy is tested on low-depth host genome data (≈2× coverage) recovered from intestinal samples of two chicken genetic lines, using four types of reference panels built with different strategies. First, imputation accuracy is evaluated in 12 samples for which both low- and high-depth sequencing data are available, and high accuracy (>0.90) is obtained for all tested panels. Second, the impact of reference panel choice on population genetics statistics is assessed in 100 chickens, with all four panels yielding comparable results. In light of these observations, the feasibility and application of the imputation strategy are discussed for different species with regard to host DNA proportion, genomic diversity, and availability of a reference panel. This method makes it possible to leverage hitherto discarded host DNA to gain insights into the genetic structure of host populations and, in doing so, facilitates the implementation of hologenomic approaches that jointly analyze host and microbial genomic data.

Keywords: host; imputation; metagenomic data; population genetics.