Exploring statistical weight estimates for mitochondrial DNA matches involving heteroplasmy

Int J Legal Med. 2022 May;136(3):671-685. doi: 10.1007/s00414-022-02774-5. Epub 2022 Mar 4.

Abstract

Massively parallel sequencing (MPS) of mitochondrial (mt) DNA allows forensic laboratories to report heteroplasmy on a routine basis. Statistical approaches will be needed to determine the relative frequency of observing an mtDNA haplotype when including the presence of a heteroplasmic site. Here, we examined 1301 control region (CR) sequences, collected from individuals in four major population groups (European, African, Asian, and Latino), and covering 24 geographically distributed haplogroups, to assess the rates of point heteroplasmy (PHP) on an individual and nucleotide position (np) basis. With a minor allele frequency (MAF) threshold of 2%, the data were similar across population groups, with an overall PHP rate of 37.7%, and the majority of heteroplasmic individuals (77.3%) having only one site of heteroplasmy. The majority (75.2%) of identified PHPs had an MAF of 2-10%, and PHPs were observed at 12.6% of the nps across the CR. Both the broad and phylogenetic testing suggested that in many cases the low number of observations of heteroplasmy at any one np results in a lack of statistical association. The posterior frequency estimates, which skew conservative to a degree depending on the sample size in a given haplogroup, had a mean of 0.152 (SD 0.134) and ranged from 0.031 to 0.83. As expected, posterior frequency estimates decreased in accordance with 1/n as the sample size (n) increased. This provides a proposed conservative statistical framework for assessing haplotype/heteroplasmy matches when applying an MPS technique in forensic cases and will allow for continual refinement as more data are generated, both within the CR and across the mitochondrial genome.
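The 1/n behaviour of the posterior frequency estimates can be illustrated with a minimal sketch. The abstract does not state the prior or the estimator used, so the code below assumes a Beta-Binomial model with a uniform Beta(1, 1) prior; the function name `posterior_mean` and its parameters are hypothetical and are not taken from the paper.

```python
def posterior_mean(x: int, n: int, a: float = 1.0, b: float = 1.0) -> float:
    """Posterior mean frequency under an ASSUMED Beta(a, b) prior.

    x: number of times the haplotype/heteroplasmy combination was observed
    n: number of individuals sampled in the haplogroup
    The uniform prior (a = b = 1) is an illustrative assumption, not the
    paper's stated method.
    """
    if n < 0 or x < 0 or x > n:
        raise ValueError("require 0 <= x <= n")
    # Beta-Binomial conjugate update: the posterior is Beta(a + x, b + n - x),
    # whose mean is (a + x) / (a + b + n).
    return (x + a) / (n + a + b)


# With zero observations the estimate is 1 / (n + 2) under this prior, so it
# shrinks roughly as 1/n as the haplogroup sample size grows.
for n in (5, 10, 50, 100):
    print(n, round(posterior_mean(0, n), 4))
```

Under this assumed model, a small haplogroup sample yields a larger and therefore more conservative frequency estimate, and the estimate falls as more individuals are sampled.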

Keywords: Control region; Forensic mtDNA; Forensic statistics; Massively parallel sequencing; MiSeq; Rates of mtDNA heteroplasmy.

MeSH terms

  • DNA, Mitochondrial* / genetics
  • Genome, Mitochondrial*
  • Heteroplasmy
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Phylogeny
  • Sequence Analysis, DNA

Substances

  • DNA, Mitochondrial