Bacterial rose garden for metagenomic SNP-based phylogeny visualization

BioData Min. 2015 Mar 21:8:10. doi: 10.1186/s13040-015-0045-5. eCollection 2015.

Abstract

Background: Metagenomics is one of the most challenging tasks in genomic analysis today. Biomedical applications of metagenomics give rise to datasets containing hundreds to thousands of samples taken from various body sites of hundreds of patients. A metagenome is inherently far more complex than a single genome, because the amounts of its constituent bacteria vary over time. Other levels of data complexity include the geography of the samples and the phylogenetic distance between genomes of the same operational taxonomic unit (OTU). We have developed a visualization concept for representing multilayer metagenomic data: the bacterial rose garden. The approach displays the taxonomic distance between representatives of the same OTU in different samples and can use a variety of metadata for display.

Results: We have developed a visualization principle that supports multilayer information representation, incorporating data on OTU diversity across metagenomes and on the origin of the samples. The visual representation, which we call a "rose", focuses on the phylogenetic distance between representatives of the same OTU. It is implemented as an interactive chart that lets the user interact with the data and explore variables. The classical taxonomic tree is known to be a reduction of the information in the original pairwise distance matrix. The visualization presented here preserves all the available information by projecting the distance matrix into the single-dimensional space of one sample. It could serve as a basis for further, more complex information representations. We have used the proposed principle to visualize the phylogenetic distances of 101 bacterial OTUs, and we provide open code for generating the web page.
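
One possible reading of the projection described in the Results can be sketched in Python: for a chosen reference sample, every other sample is placed at a radius equal to its distance from that reference, so each pairwise distance involving the reference is kept as-is rather than approximated by a tree. The function names, sample labels, and toy distances below are hypothetical, and the even angular spacing is an illustrative assumption, not the layout used in the published prototype.

```python
import math


def project_onto_sample(dist, names, reference):
    """Return {sample: distance to reference} from a symmetric pairwise matrix.

    Assumption: "projection into the space of one sample" is read here as
    taking the reference sample's row of the distance matrix.
    """
    ref = names.index(reference)
    return {name: dist[ref][j] for j, name in enumerate(names) if j != ref}


def petal_coordinates(projection):
    """Place each sample on its own ray; the radius is its distance to the reference.

    Assumption: rays are evenly spaced and ordered alphabetically by sample name.
    """
    items = sorted(projection.items())
    step = 2 * math.pi / len(items)
    return {
        name: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, (name, radius) in enumerate(items)
    }


# Toy symmetric distance matrix for one OTU observed in four samples (made-up values).
names = ["A", "B", "C", "D"]
dist = [
    [0.0, 0.1, 0.4, 0.3],
    [0.1, 0.0, 0.5, 0.2],
    [0.4, 0.5, 0.0, 0.6],
    [0.3, 0.2, 0.6, 0.0],
]

projection = project_onto_sample(dist, names, "A")
coords = petal_coordinates(projection)
for name, (x, y) in coords.items():
    print(f"{name}: x={x:.3f}, y={y:.3f}")
```

Repeating the projection with each sample in turn as the reference would cover every row of the matrix, which is one way to avoid discarding pairwise distances.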

Conclusions: The bacterial rose garden is a versatile visualization principle that copes with the major difficulties of visualizing metagenomic big data without loss of data. The proposed method shows the interconnectedness of variables and is implemented as a user-friendly web page that allows dynamic data exploration. The concept is one of the original approaches to metagenomic data representation and sharing. A fully functional prototype can be found at http://rosegarden.datalaboratory.ru.

Keywords: Gut microbiota; Metagenomic data visualization; Phylogeny visualization; Rose garden.