Investigating microbiome and transcriptome data to uncover the key microbial community involved in lignocellulose degradation within the Deulajhari hot spring consortium

Data Brief. 2023 Oct 5:51:109648. doi: 10.1016/j.dib.2023.109648. eCollection 2023 Dec.

Abstract

Geothermally heated spring water contaminated with decomposed leaf biomass creates unique hot spring ecosystems that support the recycling of diverse nutrients and harbor microbial consortia capable of degrading lignocellulose. We present microbiome and transcriptome data from the bacterial consortium of the Deulajhari hot springs in Angul, Odisha, India, which have a temperature of approximately 58 °C and are surrounded by a dense population of Pandanus plants. Metagenomics and metatranscriptomics datasets were generated by extracting total DNA and RNA from a consortium sample of hot spring sediment, followed by shotgun sequencing on the Illumina HiSeq 2500 platform. The metagenomics dataset produced approximately 38,694 contigs and the metatranscriptomics dataset 9,226 contigs, with total nucleotide sizes of 89,857,616 bp and 15,541,403 bp, respectively. Analysis using MEGAN6 against the NCBI Taxonomy database revealed 18 and 12 phyla, including candidate phyla, in the respective datasets. Proteobacteria had the highest relative abundance in the metagenomics dataset, while Firmicutes was highly abundant in the metatranscriptomics dataset. At the genus level, 92 and 25 genera were predicted in the metagenomics and metatranscriptomics datasets, respectively, with the lignocellulose-degrading genus Meiothermus highly abundant in both. Unknown bacteria and unidentified sequences also made up a substantial proportion of the metatranscriptomics dataset. We assembled and functionally annotated approximately 23,960 contigs using the Prokka pipeline. Among the SEED categories, the most highly expressed annotated microbial genes fell under the unknown category and under biotin synthesis and utilization. Some of these genes were also implicated in the degradation of aromatic amino acids, D-mannitol, and D-mannose. These findings contribute to our understanding of how the composition and abundance of bacterial communities facilitate lignocellulose degradation in extreme environments, with relevance to biofuel generation.
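
The contig counts and total nucleotide sizes reported above are standard assembly summary statistics. As an illustration only, the following minimal Python sketch shows how such figures could be computed from a FASTA file of assembled contigs. The function name `assembly_summary` and the toy sequences are hypothetical and are not part of the published workflow, which does not state how these statistics were obtained.

```python
from io import StringIO


def assembly_summary(fasta_handle):
    """Return (contig count, total bp, mean contig length) for a FASTA stream.

    Hypothetical helper for illustration; not the authors' pipeline.
    """
    lengths = []
    current = None  # length of the contig currently being read
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new contig; store the previous one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            # Sequence lines may be wrapped, so accumulate their lengths.
            current += len(line)
    if current is not None:
        lengths.append(current)
    total = sum(lengths)
    mean = total / len(lengths) if lengths else 0.0
    return len(lengths), total, mean


# Toy input standing in for an assembled contig file (made-up sequences).
toy = StringIO(">contig_1\nACGTACGT\nACG\n>contig_2\nGGCC\n")
n, total, mean = assembly_summary(toy)
print(n, total, mean)
```

Applied to the totals reported in the abstract, the same arithmetic gives mean contig lengths of roughly 2,322 bp for the metagenomics dataset (89,857,616 / 38,694) and roughly 1,685 bp for the metatranscriptomics dataset (15,541,403 / 9,226).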

Keywords: Consortium; Hotspring; Lignocellulose; Metagenomics; Metatranscriptomics; Microbiome.