Modeling allelic analyte signals for aSTRs in NGS DNA profiles

J Forensic Sci. 2021 Jul;66(4):1234-1245. doi: 10.1111/1556-4029.14685. Epub 2021 Feb 18.

Abstract

We describe an adaptation of Bright et al.'s work modeling peak height variability in CE-DNA profiles to the modeling of allelic aSTR (autosomal short tandem repeat) read counts from NGS-DNA profiles, specifically for profiles generated from the ForenSeq™ DNA Signature Prep Kit, DNA Primer Mix B. Bright et al.'s model consists of three key components within the estimation of total allelic product: template, locus-specific amplification efficiencies, and degradation. In this work, we investigated the two mass parameters (template and locus-specific amplification efficiencies) and used MLE (maximum likelihood estimation) and MCMC (Markov chain Monte Carlo) methods to obtain point estimates to calculate the total allelic product. The expected read counts for alleles were then calculated after proportioning some of the expected stutter product from the total allelic product. Due to preferential amplicon selection introduced by the sample purification beads, degradation is difficult to model from the aSTR outputs alone. Improved modeling of the locus-specific amplification efficiencies may mask the effects of degradation. Whilst this model could be improved by introducing locus-specific variances in addition to locus-specific priors, our results demonstrate the suitability of adapting Bright et al.'s allele peak height model for NGS-DNA profiles. This model could be incorporated into continuous probabilistic interpretation approaches for mixed DNA profiles.
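
As a rough illustration of the quantities named in the abstract, the Python sketch below computes a total allelic product from a template amount and a locus-specific amplification efficiency, then apportions part of it to stutter to obtain an expected allele read count. The function names, the simple multiplicative form, the allele-copy factor, and the stutter-ratio split T / (1 + r) are assumptions made for illustration rather than details stated in the abstract. Degradation is left out, and all input values are invented.

```python
def total_allelic_product(template, locus_efficiency, allele_copies=1):
    """Total allelic product for one allele at one locus.

    Assumed form: template amount scaled by a locus-specific
    amplification efficiency and by the number of allele copies
    (1 for a heterozygous allele, 2 for a homozygous one).
    Degradation is deliberately omitted.
    """
    if template < 0 or locus_efficiency < 0:
        raise ValueError("template and locus_efficiency must be non-negative")
    return template * locus_efficiency * allele_copies


def split_allele_and_stutter(total_product, stutter_ratio):
    """Split a total allelic product into allele and stutter parts.

    Assumed form: the stutter ratio r is stutter / allele, so
    allele = T / (1 + r) and stutter = T - allele.
    """
    if stutter_ratio < 0:
        raise ValueError("stutter_ratio must be non-negative")
    allele = total_product / (1.0 + stutter_ratio)
    stutter = total_product - allele
    return allele, stutter


if __name__ == "__main__":
    # Hypothetical values: template in read-count units, a relative
    # efficiency for one locus, and a 10% stutter ratio.
    total = total_allelic_product(template=1000.0, locus_efficiency=0.8,
                                  allele_copies=2)
    allele_reads, stutter_reads = split_allele_and_stutter(total, 0.10)
    print(f"total={total:.1f} allele={allele_reads:.1f} "
          f"stutter={stutter_reads:.1f}")
```

In a fuller treatment, the template and per-locus efficiencies would be estimated from observed read counts (for example by MLE or MCMC, as the abstract describes) rather than supplied by hand.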

Keywords: STRs; amplification efficiency; autosomal; continuous models; massive parallel sequencing; next-generation sequencing; probabilistic genotyping.

MeSH terms

  • Alleles*
  • DNA Fingerprinting / methods*
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Likelihood Functions
  • Microsatellite Repeats*
  • Monte Carlo Method
  • Sequence Analysis, DNA*