Malaria haplotype frequency estimation

Stat Med. 2013 Sep 20;32(21):3737-51. doi: 10.1002/sim.5792. Epub 2013 Apr 23.

Abstract

We present a Bayesian approach for estimating the relative frequencies of multi-SNP (single nucleotide polymorphism) haplotypes in populations of the malaria parasite Plasmodium falciparum from microarray SNP data obtained from human blood samples. Each sample comes from a malaria patient and contains one or several parasite clones that may differ genetically. Samples containing multiple parasite clones with different genetic markers pose a special challenge, comparable to that posed by a polyploid organism. The data from each blood sample indicate whether the parasites in the blood carry a mutant or a wildtype allele at each of several selected genomic positions. If both mutant and wildtype alleles are detected at a given position in a multiply infected sample, the data indicate that both alleles are present, but their ratio is unknown. Thus, the data only partially reveal which specific combinations of genetic markers (i.e. haplotypes across the examined SNPs) occur in distinct parasite clones. In addition, SNP data may contain errors at non-negligible rates. We use a multinomial mixture model with partially missing observations to represent these data and a Markov chain Monte Carlo method to estimate the haplotype frequencies in a population. Our approach addresses both challenges: multiple infections and data errors.

Keywords: Bayesian mixture model; Gibbs sampling; frequency estimation; haplotypes; malaria; multiple infection.
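
The ambiguity described in the abstract, and the general idea of resolving it by Gibbs sampling, can be illustrated with a small toy sketch. This is not the authors' model: it assumes only two SNPs, a known number of clones per sample, error-free SNP calls, and a symmetric Dirichlet prior, whereas the paper also handles data errors. All names (`observe`, `compatible`, `gibbs`) are hypothetical and chosen for illustration only.

```python
import itertools
import random

# Toy assumption: 2 SNPs, so 4 possible haplotypes (0 = wildtype, 1 = mutant).
HAPLOTYPES = list(itertools.product((0, 1), repeat=2))


def observe(clones):
    """Per-SNP call for a sample: 0 = wildtype only, 1 = mutant only, 2 = both."""
    calls = []
    for snp in range(len(clones[0])):
        alleles = {h[snp] for h in clones}
        calls.append(2 if len(alleles) == 2 else alleles.pop())
    return tuple(calls)


def compatible(calls, n_clones):
    """All ordered clone tuples of the given size that reproduce the calls."""
    return [c for c in itertools.product(HAPLOTYPES, repeat=n_clones)
            if observe(c) == calls]


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def gibbs(samples, n_iter=2000, burn_in=500, prior=1.0, seed=0):
    """samples: list of (calls, n_clones) pairs, with n_clones assumed known.

    Returns the posterior mean frequency of each haplotype.
    """
    rng = random.Random(seed)
    index = {h: i for i, h in enumerate(HAPLOTYPES)}
    options = [compatible(calls, k) for calls, k in samples]
    if any(not opts for opts in options):
        raise ValueError("a sample has no compatible clone assignment")
    freqs = [1.0 / len(HAPLOTYPES)] * len(HAPLOTYPES)
    sums = [0.0] * len(HAPLOTYPES)
    for it in range(n_iter):
        # Step 1: impute the hidden clones of each sample given the frequencies.
        counts = [0] * len(HAPLOTYPES)
        for opts in options:
            weights = []
            for clones in opts:
                w = 1.0
                for h in clones:
                    w *= freqs[index[h]]
                weights.append(w)
            chosen = rng.choices(opts, weights=weights)[0]
            for h in chosen:
                counts[index[h]] += 1
        # Step 2: update the frequencies given the imputed haplotype counts.
        freqs = sample_dirichlet([prior + n for n in counts], rng)
        if it >= burn_in:
            for i, f in enumerate(freqs):
                sums[i] += f
    kept = n_iter - burn_in
    return {h: s / kept for h, s in zip(HAPLOTYPES, sums)}
```

In this toy setting, a two-clone sample in which both alleles are detected at both SNPs is consistent with the haplotype pair (0,0) and (1,1) as well as with the pair (0,1) and (1,0), so `compatible((2, 2), 2)` returns four ordered assignments. The sampler alternates between imputing one of these assignments per sample and redrawing the population frequencies, which is the basic data-augmentation pattern behind Gibbs sampling for mixture models with partially missing observations.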

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Data Interpretation, Statistical*
  • Genetic Variation / genetics*
  • Haplotypes / genetics
  • Humans
  • Malaria, Falciparum / blood
  • Malaria, Falciparum / genetics*
  • Malaria, Falciparum / parasitology
  • Markov Chains
  • Models, Statistical*
  • Monte Carlo Method
  • Oligonucleotide Array Sequence Analysis
  • Papua New Guinea
  • Plasmodium falciparum / genetics*
  • Polymorphism, Single Nucleotide / genetics*