MultiNanopolish: refined grouping method for reducing redundant calculations in Nanopolish

Bioinformatics. 2021 Sep 9;37(17):2757-2760. doi: 10.1093/bioinformatics/btab078.

Abstract

Motivation: Compared with second-generation sequencing technologies, third-generation sequencing technologies allow us to obtain longer reads (average ∼10 kbp, maximum 900 kbp) but bring a higher error rate (∼15%). Nanopolish is a variant and methylation detection tool based on a hidden Markov model that performs signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can greatly improve the accuracy of an assembly, but it is limited by its long running time, since most of its execution is serial and computationally expensive.

Results: In this paper, we present an effective polishing tool, Multithreading Nanopolish (MultiNanopolish), which decomposes the iterative calculation process in Nanopolish into small independent calculation tasks, making it possible to run this process in parallel. Experimental results show that, compared with the original Nanopolish and using 40 threads, MultiNanopolish reduces running time by 50% with a read-uncorrected assembler (Miniasm) and by 20% with read-corrected assemblers (Canu and Flye).
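
The general idea of splitting one long computation into independent tasks and running them on a thread pool can be illustrated with a minimal sketch. It is not MultiNanopolish's actual implementation: the function names (split_into_tasks, polish_window, polish_parallel), the fixed-size windows, and the toy per-window computation (uppercasing bases) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(length, window):
    """Cut the range [0, length) into independent half-open windows."""
    return [(start, min(start + window, length))
            for start in range(0, length, window)]


def polish_window(draft, window):
    """Hypothetical stand-in for the per-window computation.

    A real polisher would evaluate candidate edits against signal data;
    here the window is simply uppercased so the example stays runnable.
    """
    start, end = window
    return draft[start:end].upper()


def polish_parallel(draft, window=4, threads=4):
    """Process every window on a thread pool and join the results."""
    tasks = split_into_tasks(len(draft), window)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map returns results in task order, so the pieces
        # can be concatenated directly.
        pieces = list(pool.map(lambda w: polish_window(draft, w), tasks))
    return "".join(pieces)
```

Because no window depends on the result of another, the tasks can be scheduled in any order. In CPython, a process pool rather than a thread pool would typically be needed to speed up CPU-bound work of this kind.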

Availability and implementation: MultiNanopolish is available at GitHub: https://github.com/BioinformaticsCSU/MultiNanopolish.

Supplementary information: Supplementary data are available at Bioinformatics online.