Bioinformatic Amplicon Read Processing Strategies Strongly Affect Eukaryotic Diversity and the Taxonomic Composition of Communities

PLoS One. 2015 Jun 5;10(6):e0130035. doi: 10.1371/journal.pone.0130035. eCollection 2015.

Abstract

Amplicon read sequencing has revolutionized the field of microbial diversity studies. The technique was developed for bacterial assemblages and has undergone rigorous testing with mock communities. However, because of the great complexity of eukaryotes and the numbers of different rDNA copies, analyzing eukaryotic diversity is more demanding than analyzing bacterial or mock communities, so studies are needed that test the analysis methods on taxonomically diverse natural communities. In this study, we used 20 samples collected from Baltic Sea ice, slush and under-ice water to investigate three program packages (UPARSE, mothur and QIIME) and 18 different bioinformatic strategies implemented in them. Our aim was to assess the impact of the initial steps of bioinformatic strategies on the results when analyzing natural eukaryotic communities. We found significant differences among the strategies in resulting read length, number of OTUs and estimates of diversity, as well as clear differences in the taxonomic composition of communities. The differences arose mainly because of the variable number of chimeric reads that passed the pre-processing steps. Singleton removal and denoising substantially lowered the number of errors. Our study showed that the initial steps of bioinformatic amplicon read processing strategies require careful consideration before they are applied to eukaryotic communities.
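As an illustration of why a pre-processing choice such as singleton removal can shift both OTU counts and diversity estimates, the following is a minimal sketch and not the procedure used in UPARSE, mothur or QIIME. It assumes that a "singleton" is an OTU represented by exactly one read in the dataset, and the OTU names and read counts are invented for the example.

```python
import math


def remove_singletons(otu_counts):
    """Drop OTUs supported by a single read (assumed definition of a singleton)."""
    return {otu: n for otu, n in otu_counts.items() if n > 1}


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTU relative abundances."""
    total = sum(otu_counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log(n / total)
        for n in otu_counts.values()
        if n > 0
    )


# Hypothetical read counts per OTU for one sample (illustrative values only).
counts = {"OTU_1": 120, "OTU_2": 45, "OTU_3": 1, "OTU_4": 1, "OTU_5": 8}

filtered = remove_singletons(counts)

print("OTUs before:", len(counts), "after:", len(filtered))
print("Shannon before: %.3f after: %.3f"
      % (shannon_index(counts), shannon_index(filtered)))
```

In this toy sample, filtering removes two of the five OTUs while discarding only two of 175 reads, so OTU richness changes far more than read depth does. If many singletons are chimeric or erroneous reads, this is one way in which strategies that differ only in their initial steps could report different richness and diversity.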

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Baltic States
  • Biodiversity
  • Computational Biology / methods*
  • DNA, Ribosomal / chemistry
  • DNA, Ribosomal / genetics
  • Ecosystem
  • Eukaryota / classification
  • Eukaryota / genetics*
  • Eukaryota / growth & development
  • Eukaryotic Cells / classification
  • Eukaryotic Cells / metabolism*
  • Genetic Variation
  • High-Throughput Nucleotide Sequencing / methods*
  • Ice
  • Ice Cover
  • Polymerase Chain Reaction / methods*
  • RNA, Ribosomal, 18S / genetics
  • Seawater

Substances

  • DNA, Ribosomal
  • Ice
  • RNA, Ribosomal, 18S

Grants and funding

This work was supported by the Walter and Andrée de Nottbeck Foundation (MM), a University of Helsinki 3-year grant (JB, MM, KH) and the Onni Talas Foundation (KH). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.