Barcode-free next-generation sequencing error validation for ultra-rare variant detection

Nat Commun. 2019 Feb 28;10(1):977. doi: 10.1038/s41467-019-08941-4.

Abstract

The advent of next-generation sequencing (NGS) has accelerated biomedical research by enabling high-throughput analysis of DNA sequences at very low cost. However, NGS is limited in detecting rare variants (frequency < 1%) because of its high sequencing error rate (> 0.1–1%). NGS errors can be filtered out using molecular barcodes, by comparing replicate reads that share the same barcode. Consequently, these barcoding methods require redundant reads of non-target sequences, resulting in high sequencing cost. Here, we present a cost-effective, barcode-free NGS error validation method. By physically extracting and individually amplifying the DNA clones of erroneous reads, we distinguish true variants of frequency > 0.003% from systematic NGS errors and selectively validate NGS errors after sequencing. We achieve a PCR-induced error rate of 2.5 × 10⁻⁶ per base per doubling event, using 10 times fewer sequencing reads than previous studies.
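
The gap between the sequencing error rate and the variant frequencies quoted above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not the authors' method. It assumes a uniform per-base error rate and a hypothetical read depth of one million at a single position, and the function names are invented for this example.

```python
def expected_read_counts(depth, variant_freq, error_rate):
    """Expected reads at one position: (true-variant reads, erroneous reads).

    Assumes every read has the same, independent per-base error rate
    (a simplification; real NGS errors are context dependent).
    """
    true_reads = depth * variant_freq
    error_reads = depth * error_rate
    return true_reads, error_reads


def error_to_variant_ratio(variant_freq, error_rate):
    """How many erroneous reads are expected per true-variant read."""
    return error_rate / variant_freq


if __name__ == "__main__":
    depth = 1_000_000          # hypothetical depth, not from the paper
    variant_freq = 0.003 / 100  # 0.003%, the frequency quoted in the abstract
    error_rate = 0.1 / 100      # 0.1%, lower bound of the quoted error range

    true_reads, error_reads = expected_read_counts(depth, variant_freq, error_rate)
    ratio = error_to_variant_ratio(variant_freq, error_rate)
    print(f"true-variant reads: {true_reads:.0f}")
    print(f"erroneous reads:    {error_reads:.0f}")
    print(f"errors per variant: {ratio:.1f}")
```

Under these assumptions, erroneous reads outnumber true-variant reads by more than an order of magnitude even at the low end of the error range, which is why base calls at such frequencies need independent validation.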

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Cloning, Molecular
  • DNA Barcoding, Taxonomic
  • DNA, Bacterial / genetics
  • Escherichia coli / genetics
  • Gene Library
  • Genetic Variation*
  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / standards
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Polymerase Chain Reaction
  • Quality Control
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Sequence Analysis, DNA / statistics & numerical data

Substances

  • DNA, Bacterial