Analysis of large 16S rRNA Illumina data sets: Impact of singleton read filtering on microbial community description

Mol Ecol Resour. 2017 Nov;17(6):e122-e132. doi: 10.1111/1755-0998.12700. Epub 2017 Aug 3.

Abstract

Next-generation sequencing technologies give access to large sets of data, which are extremely useful in the study of microbial diversity based on the 16S rRNA gene. However, the production of such large data sets is not only marred by technical biases and sequencing noise but also increases computation time and disc space use. To improve the accuracy of OTU predictions and overcome computation, storage and noise issues, recent studies and tools have suggested removing all single reads and low-abundance OTUs, considering them as noise. Although the effect of applying an OTU abundance threshold on α- and β-diversity has been well documented, the consequences of removing single reads have been poorly studied. Here, we test the effect of singleton read filtering (SRF) on microbial community composition using in silico simulated data sets as well as sequencing data from synthetic and real communities displaying different levels of diversity and abundance profiles. Scalability to large data sets is also assessed using a complete MiSeq run. We show that SRF drastically reduces the chimera content and computational time, enabling the analysis of a complete MiSeq run in just a few minutes. Moreover, SRF accurately determines the actual community diversity: the differences in α- and β-diversity obtained with SRF and with standard procedures are much smaller than the intrinsic variability of technical and biological replicates.
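
The idea behind singleton read filtering can be illustrated with a minimal Python sketch. It assumes that a "singleton" is a read whose exact sequence occurs only once in the data set after dereplication. The abstract does not specify the pipeline actually used, so the function name `singleton_read_filter` and the toy reads below are hypothetical and serve only as an illustration.

```python
from collections import Counter


def singleton_read_filter(reads):
    """Dereplicate reads and drop sequences observed only once.

    Assumption: reads are compared by exact sequence identity.
    Returns a dict mapping each retained sequence to its read count.
    """
    counts = Counter(reads)  # dereplication: unique sequence -> abundance
    return {seq: n for seq, n in counts.items() if n > 1}


# Toy example (hypothetical reads, not data from the study)
reads = ["ACGT", "ACGT", "ACGA", "TTGC", "TTGC", "TTGC", "GGCA"]
kept = singleton_read_filter(reads)

# Number of reads discarded as singletons
removed = len(reads) - sum(kept.values())
```

In this toy input, two of the seven reads occur only once and are discarded, so only the two repeated sequences are passed on to downstream steps such as chimera detection and OTU clustering. Processing fewer unique sequences is what would reduce computation time in such a workflow.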

Keywords: 16S rRNA; Illumina; metabarcoding; microbial diversity; singleton filtering.

MeSH terms

  • Cluster Analysis
  • Computational Biology / methods*
  • DNA, Ribosomal / chemistry
  • DNA, Ribosomal / genetics
  • Metagenomics / methods*
  • Microbiota*
  • Phylogeny*
  • RNA, Ribosomal, 16S / genetics
  • Sequence Analysis, DNA

Substances

  • DNA, Ribosomal
  • RNA, Ribosomal, 16S