KATK: Fast genotyping of rare variants directly from unmapped sequencing reads

Hum Mutat. 2021 Jun;42(6):777-786. doi: 10.1002/humu.24197. Epub 2021 Apr 1.

Abstract

KATK is a fast and accurate software tool for calling variants directly from raw next-generation sequencing reads. It uses predefined k-mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning the retrieved reads locally. KATK does not use data about known polymorphisms and has NC (no call) as the default genotype. A reference or variant allele is called only if there is sufficient evidence for its presence in the data. Thus, KATK is not biased against rare variants or de-novo mutations. On simulated datasets, we achieved a false-negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1-2 h, depending on sequencing coverage.
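
The two ideas in the abstract, k-mer-based read retrieval and a default genotype of NC, can be illustrated with a minimal Python sketch. This is not KATK's implementation: the function names, the `min_support` threshold, and the genotype labels are hypothetical, and exact substring matching stands in for the local alignment step that the abstract describes.

```python
def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def retrieve_reads(reads, target_kmers, k):
    """Keep only reads that contain at least one predefined k-mer."""
    return [r for r in reads if any(km in target_kmers for km in kmers(r, k))]


def call_genotype(reads, ref_kmer, alt_kmer, min_support=3):
    """Call an allele only when enough reads support it; otherwise return NC.

    min_support is an assumed, illustrative threshold, not a KATK parameter.
    Substring matching replaces local alignment to keep the sketch short.
    """
    ref_count = sum(ref_kmer in r for r in reads)
    alt_count = sum(alt_kmer in r for r in reads)
    has_ref = ref_count >= min_support
    has_alt = alt_count >= min_support
    if has_ref and has_alt:
        return "REF/ALT"
    if has_ref:
        return "REF/REF"
    if has_alt:
        return "ALT/ALT"
    return "NC"  # default: no call without sufficient evidence


# Toy data: two alleles that differ at one base inside a 7-mer.
REF_KMER = "ACGTACG"
ALT_KMER = "ACGAACG"
reads = ["TTACGTACGTT"] * 3 + ["GGACGAACGGG"] * 3 + ["CCCCCCCCCCC"] * 5

hits = retrieve_reads(reads, {REF_KMER, ALT_KMER}, k=7)
print(len(hits), call_genotype(hits, REF_KMER, ALT_KMER))
```

Because the fallback is NC rather than the reference genotype, a site with too few supporting reads is reported as uncalled instead of being silently assigned the reference allele, which is the behavior the abstract credits for avoiding bias against rare variants.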

Keywords: de-novo mutations; k-mers; mutation discovery; next-generation sequencing; rare mutations.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms
  • Alleles
  • Chromosome Mapping / methods
  • DNA Mutational Analysis / methods*
  • Datasets as Topic
  • Female
  • Genome, Human
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Male
  • Polymorphism, Single Nucleotide
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods
  • Software*