On the critical evaluation and confirmation of germline sequence variants identified using massively parallel sequencing

J Biotechnol. 2019 Jun 10:298:64-75. doi: 10.1016/j.jbiotec.2019.04.013. Epub 2019 Apr 15.

Abstract

Although massively parallel sequencing (MPS) is becoming common practice in both research and routine clinical care, the requirements for confirming identified DNA variants by alternative methods are still debated. When variants are evaluated directly from MPS data, read depth statistics and specialized genotype quality scores are therefore highly relevant. Here we report the results of a validation study performed in two ways: 1) confirmation of MPS-identified variants using Sanger sequencing; and 2) simultaneous Sanger and MPS analysis of exons of selected genes. Detailed examination of false-positive and false-negative findings revealed typical error sources connected to low read depth/coverage, an incomplete reference genome, indel realignment problems, and microsatellite-associated amplification errors leading to base miscalling. However, all of these error types could be identified by thorough manual review of the aligned reads, based on specific patterns in the distribution of variants and their supporting reads. Moreover, our results indicate that both basic quantitative metrics (such as total read counts, alternative allele read counts, and allelic balance) and specific genotype quality scores depend on the bioinformatics pipeline used, which stresses the need to establish specific thresholds for these metrics independently in each laboratory and for each pipeline.
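To make the quantitative metrics named in the abstract concrete, the following is a minimal sketch of how total read count, alternative allele read count, and allelic balance might be combined to flag a heterozygous call for orthogonal (for example Sanger) confirmation. It is not the authors' method. The `VariantCall` type, the `needs_confirmation` helper, the definition of allelic balance as alternative reads divided by reference plus alternative reads, and every threshold value are illustrative assumptions; as the abstract stresses, real thresholds would have to be established per laboratory and per pipeline.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical container for the per-variant metrics discussed above."""
    ref_reads: int           # reads supporting the reference allele
    alt_reads: int           # reads supporting the alternative allele
    genotype_quality: float  # pipeline-specific genotype quality score


def allelic_balance(call: VariantCall) -> float:
    """Fraction of reads supporting the alternative allele (assumed definition)."""
    total = call.ref_reads + call.alt_reads
    if total == 0:
        return 0.0
    return call.alt_reads / total


def needs_confirmation(
    call: VariantCall,
    min_depth: int = 20,            # placeholder threshold, not from the paper
    min_alt_reads: int = 5,         # placeholder threshold, not from the paper
    min_gq: float = 30.0,           # placeholder threshold, not from the paper
    het_balance: tuple = (0.3, 0.7),  # placeholder range for heterozygous calls
) -> bool:
    """Return True if a heterozygous call fails any of the illustrative checks."""
    depth = call.ref_reads + call.alt_reads
    if depth < min_depth:
        return True
    if call.alt_reads < min_alt_reads:
        return True
    if call.genotype_quality < min_gq:
        return True
    low, high = het_balance
    return not (low <= allelic_balance(call) <= high)


if __name__ == "__main__":
    well_supported = VariantCall(ref_reads=50, alt_reads=48, genotype_quality=99.0)
    low_depth = VariantCall(ref_reads=8, alt_reads=3, genotype_quality=40.0)
    print(needs_confirmation(well_supported))
    print(needs_confirmation(low_depth))
```

Keeping the thresholds as function parameters rather than constants reflects the abstract's conclusion that these values depend on the pipeline and cannot be fixed globally.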

Keywords: Genomic diagnostics; Massively parallel sequencing; Sanger sequencing; Sequence variant; Sequence variant confirmation.

MeSH terms

  • DNA / genetics*
  • Exons / genetics
  • Genetic Variation / genetics
  • Genome, Human / genetics*
  • Genotype
  • Germ Cells*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Analysis, DNA / methods
  • Software

Substances

  • DNA