MultiplexSSR: A pipeline for developing multiplex SSR-PCR assays from resequencing data

Ecol Evol. 2020 Mar 4;10(6):3055-3067. doi: 10.1002/ece3.6121. eCollection 2020 Mar.

Abstract

Next-generation sequencing has greatly promoted the investigation of single nucleotide polymorphisms, while studies of simple sequence repeats are sharply decreasing. However, simple sequence repeats still present some advantages in conservation genetics. In this study, an end-to-end pipeline referred to as MultiplexSSR was established to develop, in batches and from resequencing data, multiplex PCR assays of highly polymorphic simple sequence repeats for capillary platforms. The distribution of simple sequence repeats in the genome, the error profiles of genotypes and allelotypes, and the increase in the allele length range with the number of individuals were investigated. A total of 98% of simple sequence repeats were shorter than 100 bp. The error rate of genotyping and allelotyping for dimeric patterns was ten times higher than that for other patterns. The error rate of allelotyping was lower than that of genotyping. The allele length range approached saturation at 10 individuals. The pipeline uses allele numbers to select highly polymorphic loci, masks loci with variation, and applies in silico PCR to improve primer specificity. Application of the developed multiplex SSR-PCR assays validated the pipeline's robustness: the developed simple sequence repeats showed higher polymorphism and stability, genotyping cost was lower, and marker development required only low-depth resequencing data from fewer than a dozen individuals. This pipeline fills the gap between next-generation sequencing and multiplex SSR-PCR.

Keywords: multiplex SSR‐PCR; pedigree construction; pipeline; resequencing.
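
The abstract states that the pipeline uses allele numbers to select highly polymorphic loci. As an illustration only, that selection step might be sketched as follows. This is not the MultiplexSSR implementation: the function name, the input layout (allele lengths in bp per individual and locus), and the threshold `min_alleles` are all hypothetical choices made for this sketch.

```python
from collections import defaultdict


def select_polymorphic_loci(genotypes, min_alleles=4):
    """Keep loci whose number of distinct alleles meets a threshold.

    genotypes: dict mapping individual -> {locus: (allele_len_1, allele_len_2)},
               with allele lengths in bp and None for a missing call
               (hypothetical input layout, assumed for this sketch).
    min_alleles: hypothetical cutoff for calling a locus highly polymorphic.
    Returns a dict mapping each retained locus to its sorted allele lengths.
    """
    alleles = defaultdict(set)
    for calls in genotypes.values():
        for locus, pair in calls.items():
            # Collect every observed allele length, skipping missing calls.
            alleles[locus].update(a for a in pair if a is not None)
    return {
        locus: sorted(lengths)
        for locus, lengths in alleles.items()
        if len(lengths) >= min_alleles
    }


# Toy data: three individuals, two loci (values invented for illustration).
example = {
    "ind1": {"locA": (100, 104), "locB": (150, 150)},
    "ind2": {"locA": (102, 106), "locB": (150, 152)},
    "ind3": {"locA": (100, 108), "locB": (None, None)},
}
selected = select_polymorphic_loci(example, min_alleles=4)
for locus, lengths in selected.items():
    # The allele length range is the span between shortest and longest allele.
    print(locus, len(lengths), "alleles, range", lengths[-1] - lengths[0], "bp")
```

In this toy input, locA carries five distinct alleles and is retained, while locB carries two and is dropped. The same per-locus allele sets could also be used to track how the allele length range grows as individuals are added.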