Metagenome Proteins and Database Contamination

mSphere. 2020 Nov 4;5(6):e00854-20. doi: 10.1128/mSphere.00854-20.

Abstract

Continued influx of metagenome-derived proteins with misannotated taxonomy into conventional databases, including RefSeq, threatens to eliminate the value of taxonomy identifiers. To prevent this, urgent efforts should be undertaken by submitters of metagenomic data sets as well as by database managers.

Keywords: MAG; RefSeq; binning; classification; metagenomics; taxonomy; transposons.

MeSH terms

  • Algorithms
  • Databases, Genetic / standards*
  • Databases, Genetic / statistics & numerical data
  • Metagenome*
  • Metagenomics / methods
  • Metagenomics / standards
  • Proteins / genetics*

Substances

  • Proteins