Microbial dark matter sequences verification in amplicon sequencing and environmental metagenomics data

Front Microbiol. 2023 Nov 2:14:1247119. doi: 10.3389/fmicb.2023.1247119. eCollection 2023.

Abstract

Although microorganisms constitute the most diverse and abundant life form on Earth, the vast majority of them remain uncultured in many environments. Because it is based mainly on information gleaned from cultivated microorganisms, our current body of knowledge regarding microbial life is partial and does not reflect actual microbial diversity. That diversity is hidden in the uncultured microbial majority, which microbiologists call "microbial dark matter" (MDM), a term borrowed from astrophysics. Metagenomic sequencing analysis techniques (both 16S rRNA gene and shotgun sequencing) compare gene sequences to reference databases, each of which represents only a small fraction of the existing microorganisms. Unaligned sequences lead to groups of "unknown microorganisms" that are usually ignored and removed from diversity analyses. To address this knowledge gap, we analyzed the 16S rRNA gene sequences of microbial communities from four different environments: a living organism, a desert environment, a natural aquatic environment, and a membrane bioreactor for wastewater treatment. From those datasets, we chose representative sequences of potentially unknown bacteria for additional examination as "microbial dark matter sequences" (MDMS). The existence of these sequences was validated by specific amplification and re-sequencing. The sequences were screened against databases and aligned to the Genome Taxonomy Database to build a comprehensive phylogenetic tree for additional sequence classification, revealing potentially new candidate phyla and other lineages. These putative MDMS were also screened against metagenome-assembled genomes from the explored environments for additional validation and for taxonomic and metabolic characterization. This study shows the immense importance of MDMS in environmental metataxonomic analyses of 16S rRNA gene sequences and provides a simple and readily available methodology for examining the MDM hidden behind amplicon sequencing results.
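As an illustration of the first step described above, the following minimal Python sketch flags amplicon sequence variants (ASVs) that received no phylum-level assignment, instead of discarding them, and reports the share of reads they represent. It is not the authors' pipeline. The `Asv` record, the `UNASSIGNED` label set, the function names, and the toy data are all hypothetical, and the `p__` rank prefix assumes GTDB-style lineage strings.

```python
from dataclasses import dataclass

# Labels treated as "no assignment" (assumption: adjust to your classifier's output).
UNASSIGNED = {"", "unassigned", "unclassified", "unknown"}


@dataclass
class Asv:
    asv_id: str
    taxonomy: str  # semicolon-separated lineage, e.g. "d__Bacteria;p__Proteobacteria"
    reads: int     # read count across all samples


def phylum_of(taxonomy: str) -> str:
    """Return the phylum name from a lineage string, or "" if none is present."""
    for rank in (r.strip() for r in taxonomy.split(";")):
        if rank.startswith("p__"):
            return rank[3:]
    return ""


def candidate_mdms(asvs: list[Asv]) -> list[Asv]:
    """Keep ASVs lacking a phylum-level assignment as candidates for follow-up."""
    return [a for a in asvs if phylum_of(a.taxonomy).lower() in UNASSIGNED]


def read_fraction(asvs: list[Asv], subset: list[Asv]) -> float:
    """Fraction of all reads that belong to the given subset of ASVs."""
    total = sum(a.reads for a in asvs)
    return sum(a.reads for a in subset) / total if total else 0.0


# Toy data, invented purely for illustration.
toy = [
    Asv("asv1", "d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria", 600),
    Asv("asv2", "d__Bacteria;p__", 300),
    Asv("asv3", "Unassigned", 100),
]
flagged = candidate_mdms(toy)
print([a.asv_id for a in flagged])
print(read_fraction(toy, flagged))
```

Under these assumptions, the flagged ASVs would be the starting point for the downstream steps described in the abstract (specific amplification, re-sequencing, and phylogenetic placement), which this sketch does not attempt to reproduce.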

Keywords: amplicon sequencing; bacteria; metagenomics; microbial community; microbial dark matter.