Estimating microhaplotype allele frequencies from low-coverage or pooled sequencing data

BMC Bioinformatics. 2023 Nov 3;24(1):415. doi: 10.1186/s12859-023-05554-z.

Abstract

Background: Microhaplotypes have the potential to be more cost-effective than SNPs for applications that require genetic panels of highly variable loci. However, development of microhaplotype panels is hindered by a lack of methods for estimating microhaplotype allele frequency from low-coverage whole genome sequencing (WGS) or pooled sequencing (pool-seq) data.

Results: We developed new methods for estimating microhaplotype allele frequency from low-coverage whole genome sequencing and pool-seq data. We validated these methods using datasets from three non-model organisms. These methods allowed estimation of allele frequency and expected heterozygosity at depths routinely achieved from pooled sequencing.
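As an illustration of how expected heterozygosity follows from allele frequencies at a locus, the sketch below applies the standard population-genetic definition, H_e = 1 - sum(p_i^2). It is not the authors' estimator and does not reproduce their script. The function name and the example frequencies are hypothetical, and renormalizing the input so that it sums to one is an assumption made here for convenience.

```python
def expected_heterozygosity(freqs):
    """Return 1 - sum(p_i^2) for one locus.

    freqs: non-negative allele frequency estimates for a single
    microhaplotype locus (hypothetical input; any estimator could
    have produced them).
    """
    if any(f < 0 for f in freqs):
        raise ValueError("allele frequencies must be non-negative")
    total = sum(freqs)
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    # Assumption: renormalize so that rounding error in the estimates
    # does not bias the result.
    normalized = [f / total for f in freqs]
    return 1.0 - sum(p * p for p in normalized)


# Hypothetical four-allele locus; the result is approximately 0.70.
example = expected_heterozygosity([0.4, 0.3, 0.2, 0.1])
```

A locus with more alleles at more even frequencies gives a higher value, which is why this statistic is a natural way to rank candidate loci for a panel.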

Conclusions: These new methods will allow microhaplotype panels to be designed using low-coverage WGS and pool-seq data to discover and evaluate candidate loci. The Python script implementing the two methods, along with its documentation, is available at https://www.github.com/delomast/mhFromLowDepSeq.

Keywords: Genotype panel design; Low-depth whole genome sequencing; Microhaplotype; Pool-seq; Skim-seq.

MeSH terms

  • Gene Frequency
  • High-Throughput Nucleotide Sequencing* / methods
  • Polymorphism, Single Nucleotide*
  • Whole Genome Sequencing