Polly: An R package for genotyping microsatellites and detecting highly polymorphic DNA markers from short-read data

Mol Ecol Resour. 2024 May;24(4):e13933. doi: 10.1111/1755-0998.13933. Epub 2024 Feb 1.

Abstract

Highly polymorphic markers, such as microsatellites, are invaluable for the study of natural populations. However, contemporary methods for genotyping highly polymorphic variants have serious drawbacks that impede their efficiency. We created Polly, an R package with C++ source code that uses Illumina short-read data to genotype microsatellites, detect highly polymorphic variants and identify clusters of highly polymorphic SNPs, indels and microsatellites. We tested Polly on short-read data from Xiphophorus birchmanni (Teleostei: Poeciliidae) and Arabidopsis thaliana, finding it to be efficient and accurate both for microsatellite genotyping and polymorphic marker detection. Polly can be applied to any diploid population for which short-read data and at least one scaffolded reference genome are available.
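
For readers unfamiliar with what genotyping a microsatellite from short reads involves, the following is a minimal conceptual sketch in Python. It is not Polly's algorithm or API, neither of which is described in this abstract. It assumes a simplified setting in which the repeat motif and both flanking sequences are known exactly, reads contain no sequencing errors in the flanks, and a second allele is accepted only if it is supported by a hypothetical minimum fraction of spanning reads (`min_fraction`, an illustrative threshold). All function and parameter names are invented for illustration.

```python
import re
from collections import Counter


def count_repeat_units(read, left_flank, right_flank, motif):
    """Count motif copies between the two flanks in one read.

    Returns None when the read does not span the whole locus, i.e. when
    left flank, an uninterrupted run of the motif, and right flank are
    not all present in that order.
    """
    pattern = (
        re.escape(left_flank)
        + "((?:" + re.escape(motif) + ")+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(motif)


def call_diploid_genotype(reads, left_flank, right_flank, motif,
                          min_fraction=0.2):
    """Call a diploid genotype as a sorted pair of repeat counts.

    Illustrative rule (an assumption, not the published method): take the
    best-supported allele; call a heterozygote only if the runner-up is
    carried by at least `min_fraction` of the spanning reads.
    Returns None when no read spans the locus.
    """
    counts = Counter()
    for read in reads:
        units = count_repeat_units(read, left_flank, right_flank, motif)
        if units is not None:
            counts[units] += 1

    total = sum(counts.values())
    if total == 0:
        return None

    ranked = counts.most_common(2)
    first_allele = ranked[0][0]
    if len(ranked) == 2 and ranked[1][1] / total >= min_fraction:
        return tuple(sorted((first_allele, ranked[1][0])))
    return (first_allele, first_allele)


# Toy reads for a (CA)n locus flanked by ACGT and TTGA.
LEFT, RIGHT, MOTIF = "ACGT", "TTGA", "CA"
reads = (
    ["GG" + LEFT + MOTIF * 5 + RIGHT + "CC"] * 3   # allele with 5 repeats
    + [LEFT + MOTIF * 7 + RIGHT] * 2               # allele with 7 repeats
    + [LEFT + MOTIF * 6 + RIGHT]                   # single stutter-like read
    + [MOTIF * 4]                                  # does not span the locus
)
genotype = call_diploid_genotype(reads, LEFT, RIGHT, MOTIF)
print(genotype)  # (5, 7)
```

Only reads that span the entire repeat contribute to the call, which is why read length limits the alleles that can be genotyped this way. A real tool would also need to handle alignment, sequencing errors and PCR stutter, all of which this sketch ignores.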

Keywords: conservation genetics; ecological genetics; genotype; microsatellite; population genetics–empirical.

MeSH terms

  • Genetic Markers
  • Genome*
  • Genotype
  • Microsatellite Repeats
  • Polymorphism, Single Nucleotide*

Substances

  • Genetic Markers

Associated data

  • RefSeq/SRR18089322
  • RefSeq/SRR6511930
  • RefSeq/SRR5170591
  • RefSeq/SRR5172867
  • RefSeq/SRR6509133
  • RefSeq/SRR6511793
  • RefSeq/SRR6511812
  • RefSeq/SRR6511844
  • RefSeq/SRR6511845
  • RefSeq/SRR6511861
  • RefSeq/SRR6511862
  • RefSeq/SRR6511931
  • RefSeq/SRR6511932
  • RefSeq/SRR6511933
  • RefSeq/SRR6511934
  • RefSeq/SRR6511970
  • RefSeq/SRR6511971
  • RefSeq/SRR6511973
  • RefSeq/SRR6511974
  • RefSeq/SRR6511863