An Event-Driven Approach to Genotype Imputation on a Custom RISC-V Cluster

IEEE/ACM Trans Comput Biol Bioinform. 2024 Jan-Feb;21(1):26-35. doi: 10.1109/TCBB.2023.3328714. Epub 2024 Feb 5.

Abstract

This article proposes an event-driven solution to genotype imputation, a technique used to statistically infer missing genetic markers in DNA. The work implements the widely accepted Li and Stephens model, the primary contributor to the computational complexity of modern x86 solutions, to determine whether further investigation of the application is warranted in the event-driven domain. The model is implemented using graph-based Hidden Markov Modeling and executed as a customized forward/backward dynamic programming algorithm. The solution uses an event-driven paradigm to map the algorithm to thousands of concurrent cores, where events are small messages that carry both control and data within the algorithm. The design of a single processing element is discussed. This is then extended across multiple cores and executed on a custom RISC-V NoC cluster called POETS. Results demonstrate how the algorithm scales over increasing hardware resources, and a multi-core run demonstrates a 270X reduction in wall-clock processing time when compared to a single-threaded x86 solution. Optimisation of the algorithm via linear interpolation is then introduced and tested, with results demonstrating a reduction in wall-clock time of ∼5 orders of magnitude when compared to a similarly optimised x86 solution.
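For readers unfamiliar with the forward/backward computation mentioned above, the following is a minimal sequential sketch of haploid imputation under a Li and Stephens style Hidden Markov Model. It is a textbook-style illustration, not the event-driven POETS implementation described in the article. The function name `impute`, the parameters `switch` (per-marker recombination probability) and `err` (allele mismatch probability), and their default values are all assumptions made for this example.

```python
def impute(panel, target, switch=0.05, err=0.01):
    """Estimate allele dosages for a target haplotype (hypothetical helper).

    panel  : list of K reference haplotypes, each a list of 0/1 alleles
    target : list of 0/1 alleles, with None marking a missing marker
    switch : assumed probability of jumping to a random reference haplotype
    err    : assumed probability that an observed allele mismatches its copy
    """
    K, M = len(panel), len(target)

    def emission(m):
        # A missing observation carries no information about the hidden state.
        if target[m] is None:
            return [1.0] * K
        return [1.0 - err if panel[k][m] == target[m] else err for k in range(K)]

    def normalise(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass: probability of each hidden state given markers 0..m.
    fwd = [normalise([e / K for e in emission(0)])]
    for m in range(1, M):
        prev, e = fwd[-1], emission(m)
        total = sum(prev)
        fwd.append(normalise(
            [e[k] * ((1 - switch) * prev[k] + switch * total / K) for k in range(K)]
        ))

    # Backward pass: likelihood of markers m+1..M-1 given each hidden state.
    bwd = [[1.0] * K for _ in range(M)]
    for m in range(M - 2, -1, -1):
        e = emission(m + 1)
        w = [e[k] * bwd[m + 1][k] for k in range(K)]
        total = sum(w)
        bwd[m] = normalise(
            [(1 - switch) * w[k] + switch * total / K for k in range(K)]
        )

    # Posterior over reference haplotypes, weighted by their alleles.
    dosages = []
    for m in range(M):
        post = normalise([fwd[m][k] * bwd[m][k] for k in range(K)])
        dosages.append(sum(post[k] * panel[k][m] for k in range(K)))
    return dosages


if __name__ == "__main__":
    reference_panel = [[0, 0, 0, 0], [1, 1, 1, 1]]
    observed = [1, None, 1, None]
    print(impute(reference_panel, observed))
```

Each state at each marker depends only on its neighbours at the adjacent markers, which is the kind of locality that a message-passing mapping across many cores could plausibly exploit; the sketch itself runs sequentially.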

MeSH terms

  • Algorithms*
  • Computers
  • Genotype
  • Software*