Algorithm for DNA sequence assembly by quantum annealing

BMC Bioinformatics. 2022 Apr 7;23(1):122. doi: 10.1186/s12859-022-04661-7.

Abstract

Background: The assembly task is an indispensable step in sequencing genomes of new organisms and studying structural genomic changes. In recent years, the dynamic development of next-generation sequencing (NGS) methods has raised hopes of making whole-genome sequencing a fast and reliable tool used, for example, in medical diagnostics. However, this is hampered by the slowness and computational requirements of current processing algorithms, which creates a need for more efficient ones. One possible approach, still little explored, is the use of quantum computing.

Results: We present a proof of concept of a de novo assembly algorithm that uses the Genomic Signal Processing approach: overlaps between DNA reads are detected by calculating the Pearson correlation coefficient, and the assembly problem is formulated as an optimization task (the Traveling Salesman Problem). Computations performed on a classical computer were compared with the results achieved by a hybrid method combining CPU and QPU calculations. A quantum annealer by D-Wave was used for this purpose. The experiments were performed with artificially generated data and with DNA reads coming from a simulator, with actual organism genomes used as input sequences. To our knowledge, this work is one of the few in which actual sequences of organisms were used to study the de novo assembly task on a quantum annealer.
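The two steps named above (overlap detection by Pearson correlation, then ordering the reads as a TSP-like problem) can be illustrated with a minimal classical sketch. Everything in it is an assumption made for illustration and not taken from the paper: the base-to-number mapping `ENCODING`, the helper names `overlap_length`, `best_order` and `assemble`, the `min_overlap` and `threshold` parameters, and the brute-force search, which merely stands in for the CPU or hybrid CPU/QPU solver.

```python
import itertools
import numpy as np

# Hypothetical numeric encoding of bases; the abstract does not state the mapping used.
ENCODING = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(read):
    """Turn a DNA string into a numeric signal."""
    return np.array([ENCODING[base] for base in read])


def overlap_length(left, right, min_overlap=3, threshold=1 - 1e-9):
    """Longest k for which the k-suffix of `left` and the k-prefix of `right`
    have a Pearson correlation of (numerically) 1, or 0 if there is none.

    Caveat: with this simple encoding, r = 1 also occurs for signals that are
    only affinely related (e.g. "ACG" vs "CGT"), so false positives are possible.
    """
    a, b = encode(left), encode(right)
    for k in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
        suffix, prefix = a[-k:], b[:k]
        if suffix.std() == 0 or prefix.std() == 0:
            continue  # correlation is undefined for a constant signal
        r = np.corrcoef(suffix, prefix)[0, 1]
        if r >= threshold:
            return k
    return 0


def best_order(reads):
    """Exhaustively search for the read order with the largest total overlap.

    This is the path variant of the TSP with edge weight = -overlap; it is
    only feasible for a handful of reads.
    """
    n = len(reads)
    ov = [[0 if i == j else overlap_length(reads[i], reads[j]) for j in range(n)]
          for i in range(n)]
    order = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(ov[p[i]][p[i + 1]] for i in range(n - 1)),
    )
    return list(order), ov


def assemble(reads):
    """Merge reads along the best order, dropping the overlapping prefixes."""
    order, ov = best_order(reads)
    contig = reads[order[0]]
    for prev, cur in zip(order, order[1:]):
        contig += reads[cur][ov[prev][cur]:]
    return contig


# Three shuffled, error-free toy reads of length 8 cut from a 14-base sequence.
reads = ["ACGTTAGC", "ATGCGTAC", "CGTACGTT"]
print(assemble(reads))
```

On these toy reads the sketch recovers the original 14-base sequence. The exhaustive search grows factorially with the number of reads, which is the part a TSP solver, classical or annealing-based, would be expected to replace.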

Conclusions: Our proof of concept showed that using a quantum annealer (QA) for the de novo assembly task might be a promising alternative to computations performed in the classical model. The current computing power of the available devices requires a hybrid approach (combining CPU and QPU computations). The next step may be to develop a hybrid algorithm dedicated strictly to the de novo assembly task that exploits its specific properties (e.g. the sparsity and bounded degree of the overlap-layout-consensus graph).

Keywords: De novo assembly; Hybrid algorithm; Quantum annealing; TSP; Travelling salesman problem; VRP; Vehicle routing problem.

MeSH terms

  • Algorithms
  • Base Sequence
  • Computing Methodologies*
  • DNA / genetics
  • High-Throughput Nucleotide Sequencing / methods
  • Quantum Theory*
  • Sequence Analysis, DNA / methods

Substances

  • DNA