Uncovering Effects from the Structure of Metabarcode Sequences for Metagenetic and Microbiome Analysis

Methods Protoc. 2020 Mar 12;3(1):22. doi: 10.3390/mps3010022.

Abstract

The advent of next-generation sequencing has allowed for higher-throughput determination of which species live within a specific location. Here we establish that three analysis methods for estimating diversity within samples (Operational Taxonomic Units; the newer Amplicon Sequence Variants; and minhash, a method commonly found in sequence analysis) are affected by various properties of these sequence data. Using simulations, we show that the presence of Single Nucleotide Polymorphisms and the depth of coverage from each species affect the correlations between these approaches. Through this analysis, we provide insights that can inform the decision of when to apply each method. Specifically, the presence of sequence read errors and variability in sequence read coverage differentially affect these processing methods.
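To make the minhash approach concrete, the following is a minimal Python sketch of a bottom-k minhash estimate of Jaccard similarity between the k-mer sets of two sequences. It is illustrative only and is not taken from the paper: the function names (`kmers`, `minhash_sketch`, `jaccard_estimate`), the use of SHA-1 as the hash, and the defaults `k=8` and `size=32` are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_sketch(seq, k=8, size=32):
    """Bottom-k sketch: the `size` smallest 64-bit hashes of the k-mers.

    SHA-1 is an assumption chosen for reproducibility, not the hash
    used by any particular minhash tool.
    """
    hashes = sorted(
        int.from_bytes(hashlib.sha1(km.encode()).digest()[:8], "big")
        for km in kmers(seq, k)
    )
    return set(hashes[:size])


def jaccard_estimate(sketch_a, sketch_b, size=32):
    """Estimate Jaccard similarity from two bottom-k sketches.

    Take the `size` smallest hashes of the merged sketches and count
    the fraction that appear in both.
    """
    merged = sorted(sketch_a | sketch_b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in sketch_a and h in sketch_b)
    return shared / len(merged)


# Hypothetical example: a reference read and a copy with one SNP.
reference = "ACGTTGCAAGCTTAGGCTAACCGGTTAGCATCGATCGGAT"
variant = reference[:20] + "T" + reference[21:]  # C -> T at position 20
similarity = jaccard_estimate(
    minhash_sketch(reference), minhash_sketch(variant)
)
```

In this sketch a single substitution changes every k-mer that overlaps it (up to `k` of them), so the exact k-mer Jaccard similarity between the two reads falls below 1 even though they differ at only one base. That is one way in which Single Nucleotide Polymorphisms and read errors can influence a k-mer-based comparison.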

Keywords: ASV; OTU; PERMANOVA; compression; k-mer; mantel; metagenetics; microbiome; minhash.