On the reliability of DNA sequences of Ophiocordyceps sinensis in public databases

J Ind Microbiol Biotechnol. 2013 Apr;40(3-4):365-78. doi: 10.1007/s10295-012-1228-4. Epub 2013 Feb 9.

Abstract

Some DNA sequences in the International Nucleotide Sequence Databases (INSD) are erroneously annotated, which has led to misleading conclusions in publications. Ophiocordyceps sinensis (syn. Cordyceps sinensis) is a fungus endemic to the Tibetan Plateau, and in recent years we have examined more than 100 populations covering almost its entire distribution area. In this study, using data from authentic materials, we evaluated the reliability of nucleotide sequences annotated as O. sinensis in the INSD. As of October 15, 2012, the INSD contained 874 records annotated as O. sinensis, including 555 records representing nuclear ribosomal DNA (63.5 %), 197 representing protein-coding genes (22.5 %), 92 representing random markers with unknown functions (10.5 %), and 30 representing microsatellite loci (3.5 %). Our analysis indicated that 39 of the 397 internal transcribed spacer entries, 27 of the 105 small subunit entries, and five of the 53 large subunit entries were incorrectly annotated as belonging to O. sinensis. For protein-coding sequences, all records of the serine protease genes, the mating-type gene MAT1-2-1, the DNA lyase gene, the two largest subunits of RNA polymerase II, and the elongation factor-1α gene were correct, while 14 of the 73 β-tubulin entries were indeterminate. Genetic diversity analyses using the sequences correctly identified as O. sinensis revealed significant genetic differentiation in the fungus, although the extent of differentiation varied among genes. The relationship between O. sinensis and some related fungal taxa is also discussed.
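
The ribosomal DNA counts above can be turned into per-region misannotation rates with simple arithmetic. The following is a minimal sketch and not part of the study's methods; the variable and function names (rdna_entries, misannotation_rate) are illustrative assumptions, and only the counts come from the abstract.

```python
# Counts taken from the abstract: for each nuclear rDNA region,
# (total entries annotated as O. sinensis, entries judged incorrectly annotated).
# The names used here are hypothetical and chosen only for illustration.
rdna_entries = {
    "ITS": (397, 39),  # internal transcribed spacer
    "SSU": (105, 27),  # small subunit
    "LSU": (53, 5),    # large subunit
}


def misannotation_rate(total, wrong):
    """Return the share of incorrectly annotated entries as a percentage."""
    return 100.0 * wrong / total


# Per-region rates, rounded to one decimal place.
rates = {
    region: round(misannotation_rate(total, wrong), 1)
    for region, (total, wrong) in rdna_entries.items()
}

# Pooled figures across the three rDNA regions.
rdna_total = sum(total for total, _ in rdna_entries.values())
rdna_wrong = sum(wrong for _, wrong in rdna_entries.values())
overall = round(misannotation_rate(rdna_total, rdna_wrong), 1)

for region, rate in rates.items():
    print(f"{region}: {rate} % incorrectly annotated")
print(f"All rDNA: {rdna_wrong} of {rdna_total} entries ({overall} %)")
```

The three region totals sum to the 555 nuclear ribosomal DNA records reported in the abstract, which serves as a quick consistency check on the counts.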

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Cordyceps / classification
  • Cordyceps / genetics*
  • DNA, Fungal / chemistry*
  • Databases, Nucleic Acid* / statistics & numerical data
  • Genetic Variation
  • Molecular Sequence Annotation
  • Reproducibility of Results
  • Sequence Analysis, DNA

Substances

  • DNA, Fungal