Microbiome maps: Hilbert curve visualizations of metagenomic profiles

Front Bioinform. 2023 Jun 19:3:1154588. doi: 10.3389/fbinf.2023.1154588. eCollection 2023.

Abstract

Abundance profiles from metagenomic sequencing data synthesize information from billions of sequenced reads coming from thousands of microbial genomes. Analyzing and understanding these profiles can be a challenge since the data they represent are complex. Particularly challenging is their visualization, as existing techniques are inadequate when the number of taxa is in the thousands. We present a technique, and accompanying software, for the visualization of metagenomic abundance profiles using a space-filling curve that transforms a profile into an interactive 2D image. We created Jasper, an easy-to-use tool for the visualization and exploration of metagenomic profiles from DNA sequencing data. It orders taxa along a space-filling Hilbert curve and creates a "Microbiome Map", where each position in the image represents the abundance of a single taxon from a reference collection. Jasper can order taxa in multiple ways, and the resulting microbiome maps can highlight "hot spots" of microbes that are dominant in taxonomic clades or biological conditions. We use Jasper to visualize samples from a variety of microbiome studies, and discuss ways in which microbiome maps can be an invaluable tool to visualize spatial, temporal, disease, and differential profiles. Our approach can create detailed microbiome maps involving hundreds of thousands of microbial reference genomes, with the potential to unravel latent relationships (taxonomic, spatio-temporal, functional, and other) that could remain hidden using traditional visualization techniques. The maps can also be converted into animated movies that bring to life the dynamic nature of microbiomes.
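The idea of laying an ordered list of taxa along a Hilbert curve can be illustrated with a small sketch. The code below is not Jasper's implementation; it is a minimal example that assumes the taxa have already been ordered (for instance, taxonomically) and that their abundances arrive as a plain list. The function names `hilbert_d2xy` and `microbiome_map` are hypothetical, and the index-to-coordinate conversion is the standard textbook Hilbert curve algorithm.

```python
def hilbert_d2xy(side, d):
    """Convert position d along a Hilbert curve into (x, y) grid coordinates.

    `side` is the grid width and must be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def microbiome_map(abundances):
    """Place an ordered list of abundances onto a square grid (illustrative only).

    The i-th taxon in the ordering lands on the i-th cell of the Hilbert
    curve, so taxa that are close in the ordering stay close in the image.
    Cells beyond the end of the list are left at 0.0.
    """
    side = 1
    while side * side < len(abundances):
        side *= 2  # smallest power-of-two side that fits every taxon
    grid = [[0.0] * side for _ in range(side)]
    for d, abundance in enumerate(abundances):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = abundance
    return grid


if __name__ == "__main__":
    # Four made-up abundances, purely for illustration.
    for row in microbiome_map([0.4, 0.3, 0.2, 0.1]):
        print(row)
```

Because consecutive positions on the curve are always adjacent cells in the grid, a contiguous run of related taxa in the ordering shows up as a compact region of the image, which is what makes "hot spots" visible.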

Keywords: DNA sequencing; Hilbert curve; image analysis; maps; metagenomics; microbiome; profiling; visualization.

Grants and funding

The work of CV was performed in part under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. The work of CV was also funded while at the University of Nebraska-Lincoln (UNL) through a University of Nebraska Program of Excellence award, as well as the UNL Quantitative Life Sciences Initiative. An FIU Dissertation Year Fellowship partially supported the work of CV and DR-P at FIU. The work of DR-P and JP was done while they were at FIU.