Bioinformatic tools for tRNA gene analyses in mitochondrial DNA sequence data

Data Brief. 2020 Feb 22:29:105284. doi: 10.1016/j.dib.2020.105284. eCollection 2020 Apr.

Abstract

The data presented here are related to the research article entitled "Hidden cases of tRNA genes duplication and remolding in mitochondrial genomes of amphipods" (Romanova et al., 2020) [1]. Correct annotation of tRNA gene sequences in mitochondrial (mt) and nuclear genomes can be challenging because tRNA annotation/prediction programmes differ in performance and may produce false positive or false negative predictions. Annotation is further complicated by duplicated tRNA genes and by genes coding for tRNAs whose identity has been altered by a mutation in the anticodon sequence (tRNA gene remolding/recruitment). We developed an R script that automates the diagnosis of the ancestral coding specificity of a tRNA gene, regardless of its anticodon sequence, based on genetic distance comparison. Some of the predicted tRNA genes from the mt genomes of amphipods are presented. We also developed an R script for estimating the best mode of sequence alignment. It was applied to determine the best alignment of tRNA genes in [1], but it is also suitable for testing any nucleotide alignment set used in phylogenetic inference.

Keywords: Genetic distance; Mitochondrial genomes; R script; Sequence alignment; tRNA genes.
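
The idea of diagnosing the ancestral specificity of a tRNA gene by genetic distance comparison, as described in the abstract, can be illustrated with a minimal sketch. It is written in Python rather than R and is not the authors' script. It assumes that the sequences are already aligned to equal length and uses a simple p-distance (proportion of differing sites) as the distance measure. The function names `p_distance` and `closest_group`, the toy sequences, and the `annotated_as` label are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_group(query: str, references: dict[str, list[str]]) -> tuple[str, float]:
    """Return the reference group with the lowest mean distance to the query."""
    scores = {
        group: mean(p_distance(query, seq) for seq in seqs)
        for group, seqs in references.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy aligned sequences, invented for illustration (not real tRNA genes).
references = {
    "Leu": ["ACGTACGTAC", "ACGTACGTAT"],
    "Ser": ["TTGCATGCAA", "TTGCATGCAG"],
}
query = "ACGTACGTAA"
annotated_as = "Ser"  # hypothetical identity implied by the anticodon

group, dist = closest_group(query, references)

# If the nearest group disagrees with the anticodon-based annotation,
# the gene is flagged as a candidate for remolding.
remolding_candidate = group != annotated_as
print(group, round(dist, 2), remolding_candidate)
```

In this sketch, a gene whose nearest reference group differs from its anticodon-based annotation is only flagged as a candidate; a real analysis would need an appropriate substitution model, a curated reference set, and a criterion for how much closer the nearest group must be.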