RUBICON: a framework for designing efficient deep learning-based genomic basecallers

Genome Biol. 2024 Feb 16;25(1):49. doi: 10.1186/s13059-024-03181-2.

Abstract

Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The performance of basecalling has critical implications for all later steps in genome analysis. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. We present RUBICON, a framework to develop efficient hardware-optimized basecallers. We demonstrate the effectiveness of RUBICON by developing RUBICALL, the first hardware-optimized mixed-precision basecaller that performs efficient basecalling, outperforming the state-of-the-art basecallers. We believe RUBICON offers a promising path to develop future hardware-optimized basecallers.

Keywords: Basecalling; Deep neural network; Genomics sequencing; Hardware acceleration; Machine learning.
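
To make the term "mixed-precision" in the abstract concrete, the following minimal Python sketch quantizes the weights of two layers to different bit widths and compares the rounding error. It is an illustration only and is not RUBICALL's implementation: the symmetric uniform quantization scheme, the layer names, the weight values, and the bit widths are all assumptions chosen for the example.

```python
def quantize(values, bits):
    """Symmetric uniform quantization of floats to signed `bits`-bit integer codes.

    Assumption: one scale per layer, derived from the largest absolute weight.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed grid")
    qmax = 2 ** (bits - 1) - 1          # largest representable code, e.g. 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    # Round to the nearest code and clamp to the signed range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def max_error(values, bits):
    """Largest absolute difference between original and quantized-then-restored values."""
    codes, scale = quantize(values, bits)
    restored = dequantize(codes, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical layers: same weights, different bit widths (the "mixed" in mixed precision).
layers = {
    "conv_in": ([0.50, -0.25, 0.125, -1.0], 8),
    "conv_mid": ([0.50, -0.25, 0.125, -1.0], 4),
}

for name, (weights, bits) in layers.items():
    print(name, bits, "bits, max error:", max_error(weights, bits))
```

Fewer bits per weight reduce memory and compute cost but increase rounding error, which is why a mixed-precision design assigns bit widths layer by layer instead of using one width everywhere.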

MeSH terms

  • DNA / genetics
  • Deep Learning*
  • Genomics
  • Nanopores*
  • Nucleotides
  • Sequence Analysis, DNA

Substances

  • Nucleotides
  • DNA