An evaluation of errors in the mitochondrial COI sequences of Hydrachnidia (Acari, Parasitengona) in public databases

Exp Appl Acarol. 2022 Mar;86(3):371-384. doi: 10.1007/s10493-022-00703-0. Epub 2022 Feb 25.

Abstract

Public molecular databases are fundamental tools for modern taxonomic studies whose usefulness relies on the soundness of the data within them. Here, we study potential errors that can arise along the data pipeline, from sampling, specimen identification and molecular processing (digestion, amplification and sequencing) to the submission of sequences to these databases, using the DNA sequences of Hydrachnidia (Acari, Parasitengona) as a case study. Our results indicate that molecular information is available for only about 3% of the Hydrachnidia species known to date; yet, within this small percentage, errors are present in almost 5% of the species analyzed (0.5% of the sequences and almost 11% of the genera). This study underscores the scarcity of genetic data available for Hydrachnidia, but also shows that the proportion of errors in DNA sequences is relatively small. Even so, it highlights the danger associated with using DNA sequences from public databases, particularly for species identification, and reinforces the need for greater quality control measures and/or protocols to avoid an intensification of errors in the (post) genomics era. Finally, our study emphasizes that potential errors may also reveal cryptic diversity within a species.

Keywords: BOLD; Cryptic diversity; GenBank; Phylogeny; Species identification; Water mites.

MeSH terms

  • Animals
  • DNA Barcoding, Taxonomic
  • Mites* / genetics
  • Phylogeny