Gramtools enables multiscale variation analysis with genome graphs

Genome Biol. 2021 Sep 6;22(1):259. doi: 10.1186/s13059-021-02474-0.

Abstract

Genome graphs allow very general representations of genetic variation; depending on the model and implementation, variation at different length-scales (single nucleotide polymorphisms (SNPs), structural variants) and on different sequence backgrounds can be incorporated with different levels of transparency. We implement a model which handles this multiscale variation and develop a JSON extension of VCF (jVCF) allowing for variant calls on multiple references, both implemented in our software gramtools. We find gramtools outperforms existing methods for genotyping SNPs overlapping large deletions in M. tuberculosis and is able to genotype on multiple alternate backgrounds in P. falciparum, revealing previously hidden recombination.
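
To make the idea of genotyping SNPs that overlap a large deletion concrete, here is a minimal sketch of nested variant sites. It is an illustration under stated assumptions, not the gramtools implementation and not the jVCF schema: the record layout, the field names (pos, alleles, parent) and the function resolve_call are all hypothetical. It assumes haploid calls given as allele indices, and treats a SNP that sits on one allele of a deletion site as callable only when that allele is itself the one called.

```python
import json

# Hypothetical layout (NOT the jVCF schema): each site lists its alleles and,
# if it is nested, the parent site and the parent allele index that carries it.
sites = {
    # Allele 0 is the long (undeleted) sequence, allele 1 is the deletion.
    "del1": {"pos": 100, "alleles": ["ACGTACGT", "A"], "parent": None},
    # The SNP lies inside allele 0 of del1 (position 103 is the 'T').
    "snp1": {"pos": 103, "alleles": ["T", "C"], "parent": ["del1", 0]},
}


def resolve_call(sites, raw_calls, name):
    """Return the haploid call at `name`, or None if the site is not reachable.

    A nested site is reachable only if its parent site resolves to the
    allele that carries it; the check recurses up through enclosing sites.
    """
    site = sites[name]
    call = raw_calls.get(name)
    if site["parent"] is None:
        return call
    parent_name, parent_allele = site["parent"]
    if resolve_call(sites, raw_calls, parent_name) != parent_allele:
        return None  # the carrying sequence is absent, so no call is made
    return call


# Sample A carries the deletion, sample B does not (illustrative values only).
raw = {"A": {"del1": 1, "snp1": 1}, "B": {"del1": 0, "snp1": 1}}
resolved = {
    sample: {name: resolve_call(sites, calls, name) for name in sites}
    for sample, calls in raw.items()
}
print(json.dumps(resolved, indent=2))
```

In this sketch the SNP in sample A resolves to a null call because the sequence carrying it is deleted, whereas in sample B the SNP call is kept. Recording the parent allele explicitly is one simple way to state which sequence background a call was made on.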

Keywords: Genome graph; Mycobacterium tuberculosis; Pangenome; Plasmodium falciparum; VCF; Variant calling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Alleles
  • Antigens, Surface / metabolism
  • Computer Simulation
  • Genetic Variation*
  • Genome, Human*
  • Genotyping Techniques
  • Haplotypes / genetics
  • Humans
  • Mycobacterium tuberculosis / genetics
  • Plasmodium falciparum / genetics
  • Polymorphism, Single Nucleotide / genetics
  • Reproducibility of Results
  • Sequence Deletion

Substances

  • Antigens, Surface