Instances of erroneous DNA barcoding of metazoan invertebrates: Are universal cox1 gene primers too "universal"?

PLoS One. 2018 Jun 22;13(6):e0199609. doi: 10.1371/journal.pone.0199609. eCollection 2018.

Abstract

The cytochrome c oxidase subunit I (cox1) gene is the main mitochondrial molecular marker, playing a pivotal role in phylogenetic research, and is a crucial barcode sequence. Folmer's "universal" primers, designed to amplify this gene in metazoan invertebrates, allowed quick and easy barcode and phylogenetic analysis. On the other hand, the increase in the number of barcoding studies has led to more frequent publication of incorrect sequences, due to amplification of non-target taxa and insufficient analysis of the obtained sequences. Consequently, some sequences deposited in genetic databases are incorrectly described as obtained from invertebrates, while in fact being bacterial sequences. In our study, in which we used Folmer's primers to amplify COI sequences of the crustacean fairy shrimp Branchipus schaefferi (Fischer 1834), we also obtained COI sequences of microbial contaminants from Aeromonas sp. However, when we searched the GenBank database for sequences closely matching these contaminants, we found entries described as representatives of Gastrotricha and Mollusca. When these entries were compared with other sequences bearing the same names in the database, the genetic distance between the incorrect and correct sequences amplified from the same species was ca. 65%. Although the responsibility for the correct molecular identification of species rests on researchers, the errors found in already published sequence data have not been re-evaluated so far. On the basis of the standard sampling technique, we have estimated with 95% probability that the chances of finding incorrectly described metazoan sequences in GenBank depend on the systematic group and vary from less than 1% (Mollusca and Arthropoda) up to 6.9% (Gastrotricha). Consequently, the increasing popularity of DNA barcoding and metabarcoding analysis may lead to overestimation of species diversity. Finally, the study also discusses the sources of the problems with amplification of non-target sequences.
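
As an illustration of the kind of comparison behind the reported genetic distance of roughly 65%, the following minimal Python sketch computes an uncorrected pairwise distance (p-distance) between two aligned sequences. The abstract does not state which distance measure the authors used, so the choice of p-distance, the function name p_distance, and the toy sequences are assumptions made for illustration only.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two ALIGNED sequences.

    Returns the proportion of compared sites at which the sequences differ.
    Sites with a gap ('-') or an ambiguous base ('N') in either sequence
    are skipped. This is an illustrative measure, not necessarily the one
    used in the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip gaps and ambiguous sites
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Hypothetical toy sequences (not real cox1 data): 2 of 8 sites differ.
print(p_distance("ACGTACGT", "ACGTTCGA"))  # 0.25
```

A distance far above what is typical within a species, such as the ca. 65% reported here, would flag a database entry for closer inspection, for example with a similarity search against bacterial sequences.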

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • DNA Barcoding, Taxonomic*
  • DNA Primers*
  • DNA, Mitochondrial*
  • Databases, Nucleic Acid
  • Electron Transport Complex IV / genetics*
  • Invertebrates / genetics*
  • Phylogeny
  • Polymerase Chain Reaction

Substances

  • DNA Primers
  • DNA, Mitochondrial
  • Electron Transport Complex IV

Grants and funding

The research was supported by the Polish National Science Center grant no. NCN DEC-2011/01/N/NZ8/03649 to MJC (https://www.ncn.gov.pl/finansowanie-nauki/konkursy) and a grant from the University of Gdańsk, no. 538-L260-B518-17-1M, to MM (https://ug.edu.pl/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.