SQMtools: automated processing and visual analysis of 'omics data with R and anvi'o

BMC Bioinformatics. 2020 Aug 14;21(1):358. doi: 10.1186/s12859-020-03703-2.

Abstract

Background: The dramatic decrease in sequencing costs over the last decade has boosted the adoption of high-throughput sequencing applications as a standard tool for the analysis of environmental microbial communities. Nowadays, even small research groups can easily obtain raw sequencing data. After that, however, non-specialists face the double challenge of choosing among an ever-increasing array of analysis methodologies and navigating the vast amounts of results returned by these approaches.

Results: Here we present a workflow that relies on the SqueezeMeta software for the automated processing of raw reads into annotated contigs and reconstructed genomes (bins). A set of custom scripts seamlessly integrates the output into the anvi'o analysis platform, allowing filtering and visual exploration of the results. Furthermore, we provide a software package with utility functions to expose the SqueezeMeta results to the R analysis environment.

Conclusions: Altogether, our workflow allows non-expert users to go from raw sequencing reads to custom plots with only a few powerful, flexible and well-documented commands.

Keywords: Automatic; Metagenomics; Metatranscriptomics; Microbial ecology; Pipeline; Visualization.

MeSH terms

  • Computational Biology / methods*
  • Contig Mapping
  • Databases, Factual
  • High-Throughput Nucleotide Sequencing
  • Metagenomics
  • Software*