A framework for high-throughput sequence alignment using real processing-in-memory systems

Bioinformatics. 2023 May 4;39(5):btad155. doi: 10.1093/bioinformatics/btad155.

Abstract

Motivation: Sequence alignment is a memory-bound computation whose performance on modern systems is limited by memory bandwidth. Processing-in-memory (PIM) architectures alleviate this bottleneck by equipping the memory with compute capabilities. We propose Alignment-in-Memory (AIM), a framework for high-throughput sequence alignment using PIM, and evaluate it on UPMEM, the first publicly available general-purpose programmable PIM system.

Results: Our evaluation shows that a real PIM system can substantially outperform server-grade multi-threaded CPU systems running at full scale when performing sequence alignment for a variety of algorithms, read lengths, and edit distance thresholds. We hope that our findings inspire more work on creating and accelerating bioinformatics algorithms for such real PIM systems.
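
For readers unfamiliar with the term, an edit distance threshold bounds how many insertions, deletions, and substitutions an alignment may contain. The sketch below is illustrative only and is not taken from the AIM code base; the function name `banded_edit_distance` and its parameters are assumptions made for this example. It restricts a standard dynamic-programming edit distance to a diagonal band of width k, which is one common way a threshold reduces the work per read pair.

```python
from typing import Optional


def banded_edit_distance(a: str, b: str, k: int) -> Optional[int]:
    """Return the edit distance between a and b if it is <= k, else None.

    Hypothetical helper for illustration; not part of AIM.
    """
    n, m = len(a), len(b)
    # A length difference larger than k already requires more than k edits.
    if abs(n - m) > k:
        return None

    inf = k + 1  # any value above the threshold is treated as "too far"
    # prev[j] is the distance between the empty prefix of a and b[:j].
    prev = [j if j <= k else inf for j in range(m + 1)]

    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if i <= k:
            cur[0] = i  # deleting the first i characters of a
        # Only cells within k of the main diagonal can stay under the threshold.
        lo = max(1, i - k)
        hi = min(m, i + k)
        for j in range(lo, hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # deletion from a
                cur[j - 1] + 1,      # insertion into a
                inf,                 # cap so values never exceed k + 1
            )
        prev = cur

    return prev[m] if prev[m] <= k else None
```

Each row touches at most 2k + 1 cells in the inner loop, so a smaller threshold means fewer cell updates per pair. This sketch still allocates a full row per step for clarity; a tuned version would store only the band.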

Availability and implementation: Our code is available at https://github.com/safaad/aim.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology
  • High-Throughput Nucleotide Sequencing
  • Sequence Alignment
  • Sequence Analysis, DNA
  • Software*