Taxonomic reliability of DNA sequences in public sequence databases: a fungal perspective

PLoS One. 2006 Dec 20;1(1):e59. doi: 10.1371/journal.pone.0000059.

Abstract

Background: DNA sequences are increasingly seen as one of the primary information sources for species identification in many organism groups. Such approaches, popularly known as barcoding, are underpinned by the assumption that the reference databases used for comparison are sufficiently complete and feature correctly and informatively annotated entries.

Methodology/principal findings: The present study uses a large set of fungal DNA sequences from the inclusive International Nucleotide Sequence Database to show that the taxon sampling of fungi is far from complete, that about 20% of the entries may be incorrectly identified to species level, and that the majority of entries lack descriptive and up-to-date annotations.

Conclusions: The problems with taxonomic reliability and insufficient annotations in public DNA repositories form a tangible obstacle to sequence-based species identification, and it is manifest that the greatest challenges to biological barcoding will be of a taxonomical, rather than technical, nature.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Fungal / genetics*
  • DNA, Ribosomal / genetics
  • Databases, Nucleic Acid*
  • Fungi / classification*
  • Fungi / genetics*
  • Phylogeny
  • Reproducibility of Results
  • Species Specificity

Substances

  • DNA, Fungal
  • DNA, Ribosomal