SnakeCube: containerized and automated pipeline for de novo genome assembly in HPC environments

BMC Res Notes. 2022 Mar 7;15(1):98. doi: 10.1186/s13104-022-05978-5.

Abstract

Objective: Rapid progress in sequencing technology and the associated bioinformatics tools is helping to disentangle diversity and conservation issues through genome analyses. The foremost challenges in the field are addressing the questions that emerge from the swift development and application of new algorithms, and establishing standardized analysis approaches that promote transparency and transferability in research.

Results: Here, we present SnakeCube, an automated, containerized pipeline for de novo whole-genome assembly that runs within isolated, secured environments and scales for use in High Performance Computing (HPC) environments. SnakeCube was optimized for performance and tested for effectiveness with various inputs, supporting its safe, robust, and broad use in the field.

Keywords: Assembly; Container; Genome; Pipeline; de novo.

MeSH terms

  • Algorithms
  • Computational Biology
  • Genome* / genetics
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA
  • Software*