Be aware of the allele-specific bias and compositional effects in multi-template PCR

PeerJ. 2022 Aug 30:10:e13888. doi: 10.7717/peerj.13888. eCollection 2022.

Abstract

High-throughput sequencing of amplicon libraries is the most widespread and one of the most effective ways to study the taxonomic structure of microbial communities, despite the growing accessibility of whole metagenome sequencing. Owing to targeted amplification, the method provides unparalleled resolution of communities, but at the same time it perturbs the initial community structure, thereby reducing data robustness and compromising downstream analyses. Experimental research on these perturbations is largely limited to comparative studies of different PCR protocols and does not consider other sources of experimental variation related to the characteristics of the initial microbial composition itself. Here we analyse these sources and demonstrate how dramatically they affect the relative abundances of taxa during the PCR cycles. We developed a mathematical model of PCR amplification that assumes heterogeneity of amplification efficiencies and takes into account the compositional nature of the data. We designed an experiment (five consecutive amplicon cycles, 22-26, with 12 replicates of one real human stool microbial sample) and estimated the dynamics of the microbial community in line with the model. We found high heterogeneity in the amplicon efficiencies of taxa, which leads to non-linear and substantial (up to fivefold) changes in relative abundances during PCR. The analysis of possible sources of heterogeneity revealed a significant association between amplicon efficiencies and the energy of secondary structures of the DNA templates. Our results highlight non-trivial changes in the dynamics of real-life microbial communities due to their compositional nature. The observed effects are relevant not only to amplicon libraries but also to any study of metagenome dynamics.
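
To illustrate why heterogeneous efficiencies combined with compositional data can shift relative abundances, the following is a minimal sketch and not the model from the paper. It assumes idealised exponential amplification in which each template grows by a factor of (1 + e) per cycle, with e a constant per-taxon efficiency. The function name `relative_abundances`, the three-taxon community, and the efficiency values are all hypothetical and chosen only for illustration.

```python
def relative_abundances(initial, efficiencies, n_cycles):
    """Relative abundances after n_cycles of idealised exponential PCR.

    Assumption (not taken from the paper): template i grows by a constant
    factor (1 + efficiencies[i]) in every cycle.
    """
    amplified = [
        x * (1.0 + e) ** n_cycles
        for x, e in zip(initial, efficiencies)
    ]
    # Closure step: sequencing reports proportions, not absolute copy
    # numbers, so every taxon's share depends on all the other taxa.
    total = sum(amplified)
    return [a / total for a in amplified]


# Hypothetical community and efficiencies, for illustration only.
initial = [0.5, 0.3, 0.2]
efficiencies = [0.95, 0.90, 0.80]

# Cycles 22-26 mirror the range used in the experiment.
for n in range(22, 27):
    shares = relative_abundances(initial, efficiencies, n)
    print(n, [round(s, 3) for s in shares])
```

Under these assumptions, equal efficiencies leave the composition unchanged, whereas unequal efficiencies enrich the most efficient templates with every additional cycle. Because of the closure step, whether a taxon with an intermediate efficiency gains or loses share depends on the rest of the community, not on its own efficiency alone.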

Keywords: Amplification bias; Bayesian inference; Compositional data analysis; High-throughput sequencing; Microbiome; PCR.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Bacteria*
  • Humans
  • Nucleic Acid Amplification Techniques*
  • Polymerase Chain Reaction / methods
  • RNA, Ribosomal, 16S / genetics

Substances

  • RNA, Ribosomal, 16S

Grants and funding

The study was funded by the Russian Science Foundation (RSF) grant number 18-16-00073-P. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.