Towards eradicating the nuisance of numts and noise in molecular biodiversity assessment

Mol Ecol Resour. 2021 Aug;21(6):1755-1758. doi: 10.1111/1755-0998.13414. Epub 2021 Jul 1.

Abstract

DNA metabarcoding is a popular methodology for biodiversity assessment and is increasingly used for community-level analysis of intraspecific genetic diversity. The evolutionary history of hundreds of specimens can be captured in a single collection vial. However, the method is not without pitfalls, which may inflate or misrepresent recovered diversity metrics. Nuclear pseudogene copies of mitochondrial DNA (numts) have been particularly difficult to control because they can evolve rapidly and appear deceptively similar to true mitochondrial sequences. While the problem of numts has long been recognized for traditional sequencing approaches, the issues they create are particularly evident in metabarcoding, in which the identity of individual specimens is generally not known. In this issue of Molecular Ecology Resources, Andújar et al. (2021) provide an easy-to-implement bioinformatic approach to reduce erroneous sequences due to numts and residual noise in metabarcoding data sets. The metaMATE software designates input sequences as authentic (mtDNA haplotypes) or nonauthentic (numts and erroneous sequences) by comparison to reference data and by analysing nucleotide substitution patterns. Filtering is applied over a range of abundance thresholds, and the choice between a stricter and a more lenient sequence removal strategy is left to the researchers' discretion. This is a valuable addition to a growing number of complementary tools for improving the reliability of modern biodiversity monitoring.
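
The idea of filtering over a range of abundance thresholds can be illustrated with a minimal sketch. This is not metaMATE's code or interface: the record layout, the function name survey_thresholds, and the label strings are all hypothetical, and the sketch assumes each sequence has already been labelled as authentic, nonauthentic, or unclassified by some earlier step. It only shows how counting the labelled sequences that survive each threshold might help a researcher compare stricter and more lenient cut-offs.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (sequence_id, read_count, label).
# label is "authentic", "nonauthentic", or None when unclassified.
Record = Tuple[str, int, Optional[str]]


def survey_thresholds(records: List[Record],
                      thresholds: List[int]) -> List[Dict[str, int]]:
    """For each minimum read count, report how many sequences survive
    and how many of the survivors carry each label."""
    results = []
    for threshold in thresholds:
        # Keep only sequences whose read count meets the threshold.
        kept = [r for r in records if r[1] >= threshold]
        results.append({
            "threshold": threshold,
            "retained": len(kept),
            "authentic_retained": sum(1 for r in kept if r[2] == "authentic"),
            "nonauthentic_retained": sum(1 for r in kept if r[2] == "nonauthentic"),
        })
    return results


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    toy = [
        ("seq_a", 100, "authentic"),
        ("seq_b", 3, "nonauthentic"),
        ("seq_c", 10, None),
        ("seq_d", 1, "nonauthentic"),
    ]
    for row in survey_thresholds(toy, [1, 5, 50]):
        print(row)
```

In this toy setting a higher threshold removes more nonauthentic sequences but can also discard unclassified, low-abundance sequences, which is the trade-off the researcher has to weigh.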

Publication types

  • News

MeSH terms

  • Biodiversity
  • Cell Nucleus*
  • DNA, Mitochondrial*
  • Haplotypes
  • Phylogeny
  • Reproducibility of Results
  • Sequence Analysis, DNA

Substances

  • DNA, Mitochondrial