A practical guide to amplicon and metagenomic analysis of microbiome data

Protein Cell. 2021 May;12(5):315-330. doi: 10.1007/s13238-020-00724-8. Epub 2020 May 11.

Abstract

Advances in high-throughput sequencing (HTS) have fostered rapid developments in the field of microbiome research, and massive microbiome datasets are now being generated. However, the diversity of software tools and the complexity of analysis pipelines make this field difficult to enter. Here, we systematically summarize the advantages and limitations of microbiome methods. Then, we recommend specific pipelines for amplicon and metagenomic analyses and describe commonly used software and databases to help researchers select the appropriate tools. Furthermore, we introduce statistical and visualization methods suitable for microbiome analysis, including alpha- and beta-diversity, taxonomic composition, difference comparisons, correlation, networks, machine learning, evolution, source tracing, and common visualization styles, to help researchers make informed choices. Finally, a step-by-step reproducible analysis guide is introduced. We hope this review will allow researchers to carry out data analysis more effectively and to quickly select the appropriate tools in order to efficiently mine the biological significance behind the data.

Keywords: high-throughput sequencing; marker genes; metagenome; pipeline; reproducible analysis; visualization.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms*
  • Computational Biology
  • High-Throughput Nucleotide Sequencing*
  • Metagenome*
  • Metagenomics*
  • Microbiota / genetics*
  • Software*