UMI-VarCal: A Low-Frequency Variant Caller for UMI-Tagged Paired-End Sequencing Data

Methods Mol Biol. 2022:2493:235-245. doi: 10.1007/978-1-0716-2293-3_14.

Abstract

The rapid transition from traditional sequencing methods to Next-Generation Sequencing (NGS) has allowed faster and more accurate detection of somatic variants (Single-Nucleotide Variants (SNVs) and Copy Number Variations (CNVs)) in tumor cells. NGS workflows involve a succession of steps during which false variants can be silently introduced at low frequencies. Filtering out these artifacts is difficult, especially when experiments are designed to detect very low-frequency variants. Recently, tagging DNA fragments with unique molecular barcodes called UMIs (Unique Molecular Identifiers) has emerged as a very effective strategy for specifically filtering false variants out of variant calling results (Kukita et al. DNA Res 22(4):269-277, 2015; Newman et al. Nat Biotechnol 34(5):547-555, 2016; Schmitt et al. Proc Natl Acad Sci U S A 109(36):14508-14513, 2012). Here, we describe UMI-VarCal (Sater et al. Bioinformatics 36:2718-2724, 2020), which uses the UMI information in UMI-tagged reads to provide faster and more accurate variant calling.
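
The general idea behind UMI-based artifact filtering can be illustrated with a minimal sketch. This is not UMI-VarCal's algorithm, which the abstract does not detail; it only shows the common principle that reads sharing a UMI derive from the same original molecule, so a base seen in only a minority of a UMI family's reads is likely an artifact. The function name `umi_supported_alleles`, the thresholds `min_family_size` and `min_agreement`, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def umi_supported_alleles(observations, min_family_size=2, min_agreement=0.7):
    """Count alleles at one position, giving each UMI family a single vote.

    observations: iterable of (umi, base) pairs, one per read covering the site.
    Thresholds are illustrative assumptions, not values from UMI-VarCal.
    """
    # Group the observed bases by UMI (one family per original molecule).
    families = defaultdict(list)
    for umi, base in observations:
        families[umi].append(base)

    consensus = Counter()
    for bases in families.values():
        # Families with too few reads cannot separate signal from error.
        if len(bases) < min_family_size:
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Keep the family only if its reads largely agree on one base.
        if count / len(bases) >= min_agreement:
            consensus[base] += 1
    return consensus


# Toy data: (UMI, base observed at the position of interest).
observations = [
    ("AACGT", "A"), ("AACGT", "A"), ("AACGT", "A"),
    ("GGTCA", "T"), ("GGTCA", "T"),
    ("CTTAG", "A"), ("CTTAG", "A"), ("CTTAG", "A"), ("CTTAG", "T"),
    ("TTGCA", "G"),
]
result = umi_supported_alleles(observations)
print(dict(result))
```

In this toy input, the lone `T` in family `CTTAG` is outvoted by the family's other reads, and the singleton family `TTGCA` is discarded, so neither contributes a spurious allele.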

Keywords: Bioinformatics; NGS; Sequencing; UMI; Variant calling.

MeSH terms

  • Artifacts
  • Computational Biology
  • DNA / genetics
  • DNA Copy Number Variations*
  • High-Throughput Nucleotide Sequencing* / methods

Substances

  • DNA