Rotate: A command-line program to rotate circular DNA sequences to start at a given position or string

Wellcome Open Res. 2023 Sep 13;8:401. doi: 10.12688/wellcomeopenres.19568.1. eCollection 2023.

Abstract

Sequences derived from circular DNA molecules (i.e. most bacterial, viral and plastid genomes) are expected to be linearised and rotated to a common start position for most downstream analyses, including alignment. Although this is a common and straightforward task, available software is either limited to a small number of input sequences, lacks the option to specify a custom anchor string, or requires a commercial license. Here, we present rotate, a simple, open-source command-line program written in C with no external dependencies, which can rotate a set of input sequences to a custom anchor string (allowing for a specified number of mismatches), or offset the input sequences to a desired position. Combining both functionalities allows all input sequences to be rotated to any desired starting position, enabling downstream analysis. rotate is extremely fast and scales linearly with the number of input sequences, taking only seconds to rotate over a thousand mitochondrial sequences.
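To make the two operations concrete, the following is a minimal Python sketch of rotating a circular sequence to an anchor string with a bounded number of mismatches, optionally followed by an offset. It is an illustration only and not the C implementation of rotate: the function names (`find_anchor`, `rotate_by_offset`, `rotate_to_anchor`), the use of substitution-only (Hamming) mismatches, the choice of the first matching position, and the handling of anchors that span the linearisation point are all assumptions made for this example.

```python
def count_mismatches(text, pattern, start, limit):
    """Count substitutions between pattern and text at start, stopping once limit is exceeded."""
    mismatches = 0
    for i, base in enumerate(pattern):
        if text[start + i] != base:
            mismatches += 1
            if mismatches > limit:
                break
    return mismatches


def find_anchor(seq, anchor, max_mismatches=0):
    """Return the first 0-based position where anchor matches the circular seq, or -1."""
    n, m = len(seq), len(anchor)
    if m == 0 or m > n:
        return -1
    # Assumption: treat the sequence as circular by appending its first m-1 bases,
    # so that an anchor spanning the linearisation point can still be found.
    extended = seq + seq[:m - 1]
    for start in range(n):
        if count_mismatches(extended, anchor, start, max_mismatches) <= max_mismatches:
            return start
    return -1


def rotate_by_offset(seq, offset):
    """Rotate seq so that the base at index offset (modulo length) becomes the first base."""
    if not seq:
        return seq
    k = offset % len(seq)
    return seq[k:] + seq[:k]


def rotate_to_anchor(seq, anchor, max_mismatches=0, offset=0):
    """Rotate seq to start at anchor, then shift by offset; return None if no match."""
    pos = find_anchor(seq, anchor, max_mismatches)
    if pos < 0:
        return None
    return rotate_by_offset(seq, pos + offset)


if __name__ == "__main__":
    genome = "GGTTACGATCA"
    print(rotate_to_anchor(genome, "ACGA"))             # exact anchor
    print(rotate_to_anchor(genome, "ACCA", 1))          # anchor with one mismatch
    print(rotate_to_anchor(genome, "ACGA", offset=-2))  # anchor, then start 2 bases earlier
```

In this sketch, combining the anchor search with an offset is what allows a set of sequences to be brought to any shared start position: the anchor fixes a common reference point, and the offset shifts the start relative to it. Each sequence is scanned once, so the total work grows linearly with the number of input sequences.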

Keywords: Genetics; bioinformatics; circular DNA; mitochondrial DNA; plastid DNA.