An automated independent workflow for the analysis of massively parallel sequence data from forensic SNP assays

Electrophoresis. 2018 Nov;39(21):2752-2756. doi: 10.1002/elps.201800085. Epub 2018 Aug 6.

Abstract

Illumina and Thermo Fisher Scientific have developed assays that permit the sequencing of forensically relevant single nucleotide polymorphisms (SNPs), along with software to determine the associated genotypes. Currently, there is no method either to independently confirm the genotypes determined using the manufacturers' software or to compare genotypes and quality metrics among samples processed using both platforms. This paper outlines an automated workflow developed in CLC Genomics Workbench that permits accurate, fast and independent analysis of SNP sequence data from either platform. A Python script was written to facilitate straightforward comparison of the genotypes generated by the manufacturers' software with those from the independent CLC analysis. Data for a total of 323 forensically relevant ancestry, identity and phenotypic SNPs can be analyzed, and the resulting genotypes, coverage, quality flags and major allele frequencies are easily compared across samples and platforms.
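
The abstract does not reproduce the comparison script, so the following is only a minimal sketch of what such a genotype comparison might look like, not the authors' implementation. The function names, the dictionary-based input layout, the status labels and the SNP identifiers are all hypothetical; the sketch assumes each analysis yields a mapping from SNP identifier to a genotype string, and that the major allele frequency is taken as the fraction of reads supporting the most frequently observed allele.

```python
from typing import Dict, List, Optional


def normalize(genotype: Optional[str]) -> Optional[str]:
    """Sort alleles so that 'GA' and 'AG' compare equal; None means no call."""
    if not genotype:
        return None
    return "".join(sorted(genotype.upper()))


def major_allele_frequency(allele_counts: Dict[str, int]) -> float:
    """Fraction of reads supporting the most frequently observed allele.

    Assumption: this is one plausible definition; the paper may compute it
    differently.
    """
    total = sum(allele_counts.values())
    return max(allele_counts.values()) / total if total else 0.0


def compare_genotypes(
    vendor: Dict[str, str], independent: Dict[str, str]
) -> List[Dict[str, Optional[str]]]:
    """Build one row per SNP seen in either call set, with a status label."""
    rows = []
    for snp in sorted(set(vendor) | set(independent)):
        a = normalize(vendor.get(snp))
        b = normalize(independent.get(snp))
        if a is None or b is None:
            status = "missing"      # called by only one of the two analyses
        elif a == b:
            status = "concordant"
        else:
            status = "discordant"
        rows.append({"snp": snp, "vendor": a, "independent": b, "status": status})
    return rows


if __name__ == "__main__":
    # Made-up placeholder identifiers and genotypes, for illustration only.
    vendor_calls = {"snp_1": "AG", "snp_2": "CC"}
    independent_calls = {"snp_1": "GA", "snp_2": "CT", "snp_3": "TT"}
    for row in compare_genotypes(vendor_calls, independent_calls):
        print(row)
```

Sorting the alleles before comparison keeps a heterozygote reported as "AG" by one tool and "GA" by the other from being flagged as a discordance. Coverage and quality flags could be carried along as extra columns in each row in the same way.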

Keywords: CLC Genomics Workbench; Forensics; Platform agnostic workflow; Python; Single nucleotide polymorphisms.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Forensic Genetics / methods
  • Gene Frequency
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Polymorphism, Single Nucleotide*
  • Software
  • Workflow