Using strain-resolved analysis to identify contamination in metagenomics data

Microbiome. 2023 Mar 2;11(1):36. doi: 10.1186/s40168-023-01477-2.

Abstract

Background: Metagenomic analyses can be negatively impacted by DNA contamination. While external sources of contamination such as DNA extraction kits have been widely reported and investigated, contamination originating within the study itself remains underreported.

Results: Here, we applied high-resolution strain-resolved analyses to identify contamination in two large-scale clinical metagenomics datasets. By mapping strain sharing to DNA extraction plates, we identified well-to-well contamination in both negative controls and biological samples in one dataset. Such contamination is more likely to occur among samples that are in the same or adjacent columns or rows of the extraction plate than among samples that are far apart. Our strain-resolved workflow also reveals the presence of externally derived contamination, primarily in the other dataset. Overall, in both datasets, contamination is more pronounced in samples with lower biomass.

Conclusion: Our work demonstrates that genome-resolved strain tracking, with its essentially genome-wide nucleotide-level resolution, can be used to detect contamination in sequencing-based microbiome studies. Our results underscore the value of strain-specific methods to detect contamination and the critical importance of looking for contamination beyond negative and positive controls.

Keywords: Contamination; Genome-resolved metagenomics; Microbiome; Strains.
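
Illustrative sketch (not from the paper): the Results describe mapping strain sharing onto extraction-plate positions and comparing nearby wells with distant ones. The Python below shows one way such a comparison could be set up. It is a minimal sketch under stated assumptions, not the authors' workflow. The sample names, well assignments, and strain-sharing calls are made-up toy data. The function names are hypothetical. The rule that "near" means the same or an adjacent row or column is one reading of the abstract's wording. The strain-sharing calls are assumed to come from an upstream strain-level genome comparison that is not shown here.

```python
from itertools import combinations


def well_to_coords(well):
    """Convert a well label such as 'B7' to zero-based (row, column)."""
    row = ord(well[0].upper()) - ord("A")
    col = int(well[1:]) - 1
    return row, col


def is_near(well_a, well_b):
    """True if two wells lie in the same or an adjacent row or column.

    Assumption: this is one possible definition of plate proximity.
    """
    r1, c1 = well_to_coords(well_a)
    r2, c2 = well_to_coords(well_b)
    return abs(r1 - r2) <= 1 or abs(c1 - c2) <= 1


def sharing_rate_by_proximity(sample_wells, sharing_pairs):
    """Fraction of sample pairs that share a strain, split by proximity.

    sample_wells:  dict mapping sample name -> well label on one plate
    sharing_pairs: set of frozensets, each holding two sample names that
                   were called as sharing a strain (hypothetical input)
    """
    counts = {"near": [0, 0], "far": [0, 0]}  # [sharing pairs, all pairs]
    for a, b in combinations(sorted(sample_wells), 2):
        key = "near" if is_near(sample_wells[a], sample_wells[b]) else "far"
        counts[key][1] += 1
        if frozenset((a, b)) in sharing_pairs:
            counts[key][0] += 1
    # None marks a category with no pairs, to avoid dividing by zero.
    return {k: (s / t if t else None) for k, (s, t) in counts.items()}


# Toy data, invented for illustration only.
sample_wells = {"S1": "A1", "S2": "A2", "S3": "B1", "S4": "D6", "S5": "H12"}
sharing_pairs = {frozenset(("S1", "S2")), frozenset(("S1", "S3"))}

rates = sharing_rate_by_proximity(sample_wells, sharing_pairs)
print(rates)
```

A markedly higher sharing rate among "near" pairs than among "far" pairs would be consistent with well-to-well contamination. A real analysis would also need a statistical test and would have to account for biologically expected sharing, for example between samples from the same subject.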

Publication types

  • Video-Audio Media
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomass
  • DNA
  • DNA Contamination
  • Metagenomics*
  • Microbiota* / genetics

Substances

  • DNA