Estimation of bacterial diversity using next generation sequencing of 16S rDNA: a comparison of different workflows

BMC Bioinformatics. 2011 Dec 14;12:473. doi: 10.1186/1471-2105-12-473.

Abstract

Background: Next generation sequencing (NGS) enables a more comprehensive analysis of bacterial diversity in complex environmental samples. NGS data can be analysed using a variety of workflows. We test several simple and complex workflows, including frequently used as well as recently published tools, and report on their accuracy and efficiency under various conditions: different sequence lengths, different numbers of sequences, and real-world experimental data from rhizobacterial populations of glyphosate-tolerant maize, treated or untreated with two different herbicides, as a representative case of differential diversity studies.

Results: Alignment and distance calculations affect operational taxonomic unit (OTU) estimations, and multiple sequence alignment has a major impact on the computational time needed. Most of the analyses produced consistent results that may be used to assess differential diversity changes; however, dataset characteristics dictate which workflow should be preferred in each case.
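
To illustrate why the distance calculation can change an OTU estimate, the following is a minimal sketch and not the method of any workflow compared in the study. It assumes pre-aligned toy sequences, a 3% distance cutoff, and single-linkage grouping chosen only for brevity. The names `distance`, `count_otus` and the `count_gaps` switch are hypothetical. The only point is that one gap-handling choice moves the OTU count.

```python
from itertools import combinations


def distance(a, b, count_gaps=True):
    """Fraction of differing columns between two aligned sequences.

    count_gaps is an illustrative switch (an assumption of this sketch):
    when False, columns with a gap in exactly one sequence are skipped
    instead of being counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # gap in both sequences: uninformative column
        if (x == "-" or y == "-") and not count_gaps:
            continue  # ignore columns with a one-sided gap
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0


def count_otus(seqs, threshold=0.03, count_gaps=True):
    """Count single-linkage clusters at the given distance threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if distance(seqs[i], seqs[j], count_gaps) <= threshold:
            parent[find(i)] = find(j)  # merge the two clusters
    return len({find(i) for i in range(len(seqs))})


# Toy aligned reads (invented for illustration, not data from the study).
toy_alignment = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTAC-TACGTACGT",  # same read with a single alignment gap
    "TTTTACGTACGTACGTAAAA",  # clearly divergent read
]

otus_gaps_counted = count_otus(toy_alignment, 0.03, count_gaps=True)
otus_gaps_ignored = count_otus(toy_alignment, 0.03, count_gaps=False)
print(otus_gaps_counted, otus_gaps_ignored)
```

With these toy reads, the single gap puts the first two sequences at a distance of 1/20 = 0.05 when gaps count as mismatches, which is above the cutoff, so they fall into separate OTUs. When gapped columns are ignored, their distance is 0 and they merge.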

Conclusions: When estimating bacterial diversity, ESPRIT and the web-based RDP pyrosequencing pipeline produced good results in all circumstances; however, the computational requirements can make method-combination workflows more attractive, depending on sequence variability, number and length.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / classification*
  • Bacteria / genetics*
  • Biodiversity
  • DNA, Ribosomal / genetics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • RNA, Ribosomal, 16S / genetics
  • Regression Analysis
  • Sequence Analysis, DNA / methods
  • Soil Microbiology*
  • Workflow

Substances

  • DNA, Ribosomal
  • RNA, Ribosomal, 16S