A Boolean algebra for genetic variants

Bioinformatics. 2023 Jan 1;39(1):btad001. doi: 10.1093/bioinformatics/btad001.

Abstract

Motivation: Beyond identifying genetic variants, we introduce a set of Boolean relations that allows a comprehensive classification of the relationship between every pair of variants by taking all minimal alignments into account. We present an efficient algorithm to compute these relations, including a novel method for computing all minimal alignments within the best theoretical complexity bounds.
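
To make the idea of relating a pair of variants more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each variant is given as its observed sequence (the reference with the variant applied) and uses an insertion/deletion edit distance derived from the longest common subsequence. The function names, the relation labels, and the distance-based criteria are illustrative assumptions, not taken from the abstract. Pairs that these shortcuts cannot decide would need all minimal alignments to be inspected, which this sketch does not attempt.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ch == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def distance(a: str, b: str) -> int:
    """Edit distance that counts only single-symbol insertions and deletions."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def classify(reference: str, lhs: str, rhs: str) -> str:
    """Hypothetical distance-based classification of two observed sequences.

    The criteria below are assumptions made for illustration only.
    """
    if lhs == rhs:
        return "equivalent"
    d_lhs = distance(reference, lhs)
    d_rhs = distance(reference, rhs)
    d_pair = distance(lhs, rhs)
    if d_lhs - d_rhs == d_pair:
        return "contains"        # rhs lies "between" the reference and lhs
    if d_rhs - d_lhs == d_pair:
        return "is_contained"    # lhs lies "between" the reference and rhs
    if d_lhs + d_rhs == d_pair:
        return "disjoint"        # no shared change can shorten the path
    return "undecided"           # would require inspecting all minimal alignments


if __name__ == "__main__":
    ref = "ACGT"
    print(classify(ref, "AGT", "AGT"))  # same observed sequence
    print(classify(ref, "AT", "AGT"))   # deleting CG versus deleting C
    print(classify(ref, "AGT", "ACT"))  # deleting C versus deleting G
    print(classify(ref, "AT", "CT"))    # not decidable from distances alone
```

The sketch takes observed sequences rather than variant descriptions so that it stays independent of any particular notation or parser.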

Results: We show that these relations are common, and many of them non-trivial, for variants of the CFTR gene in dbSNP. Finally, we present an approach for storing and indexing variants in a database that enables efficient querying for all these relations.

Availability and implementation: A Python implementation is available at https://github.com/mutalyzer/algebra/tree/v0.2.0 as well as an interface at https://mutalyzer.nl/algebra.

MeSH terms

  • Algorithms*
  • Data Management*
  • Databases, Factual
  • Software