Efficient development of highly polymorphic microsatellite markers based on polymorphic repeats in transcriptome sequences of multiple individuals

M Vukosavljev; G D Esselink; W P C van 't Westende; P Cox; R G F Visser; P Arens; M J M Smulders

doi:10.1111/1755-0998.12289

Mol Ecol Resour. 2015 Jan;15(1):17-27. doi: 10.1111/1755-0998.12289. Epub 2014 Jun 28.

Authors

M Vukosavljev¹, G D Esselink, W P C van 't Westende, P Cox, R G F Visser, P Arens, M J M Smulders

Affiliation

¹ Wageningen UR Plant Breeding, Wageningen University & Research Centre, P.O. Box 386, NL-6700AJ, Wageningen, the Netherlands; C.T. de Wit Graduate School for Production Ecology and Resource Conservation (PE&RC), Wageningen, the Netherlands.

PMID: 24893879
DOI: 10.1111/1755-0998.12289

Abstract

The first hurdle in developing microsatellite markers, cloning, has been overcome by next-generation sequencing. The second hurdle is testing to differentiate polymorphic from nonpolymorphic loci. The third hurdle, somewhat hidden, is that only polymorphic markers with a large effective number of alleles are sufficiently informative to be deployed in multiple studies. Both of the latter steps are laborious and still performed manually. We have developed a strategy in which we first screen reads from multiple genotypes for repeats that show the most length variants, and only these are subsequently developed into markers. We validated our strategy in tetraploid garden rose using Illumina paired-end transcriptome sequences of 11 roses. Of 48 markers tested, two failed to amplify, but all others were polymorphic. Ten markers amplified more than one locus, indicating duplicated genes or gene families. Completely avoiding duplicated loci will be difficult because the ranges of the numbers of predicted alleles of highly polymorphic single-locus and multilocus markers largely overlapped. Of the remainder, half were replicate markers (i.e. multiple primer pairs for one locus), indicating the difficulty of correctly filtering short reads containing repeat sequences. We subsequently refined the approach to eliminate multiple primer sets for the same loci. The remaining 18 markers were all highly polymorphic, amplifying on average 11.7 alleles per marker (range = 6-20) in 11 tetraploid roses, exceeding the 8.2 alleles per marker of the 24 most polymorphic markers genotyped previously. This strategy therefore represents a major step forward in the development of highly polymorphic microsatellite markers.
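The screening idea described above (rank repeats by the number of length variants seen across reads from multiple genotypes) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the abstract does not describe in detail. The function names `find_repeats` and `rank_loci`, the 8-base flank used to identify a locus, the minimum of five repeat units, and the restriction to di- and trinucleotide motifs are all assumptions made for illustration. The sketch also ignores reverse complements, sequencing errors, and paralogous loci.

```python
import re
from collections import defaultdict

# Assumed thresholds for illustration only: di-/trinucleotide motifs,
# at least 5 repeat units, and 8 flanking bases on each side.
SSR_PATTERN = re.compile(r"([ACGT]{2,3}?)\1{4,}")
FLANK = 8


def find_repeats(read):
    """Yield ((left_flank, motif, right_flank), unit_count) for each repeat
    that is fully spanned by the read."""
    for match in SSR_PATTERN.finditer(read):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        start, end = match.span()
        if start < FLANK or len(read) - end < FLANK:
            continue  # repeat touches a read end, so its length is unknown
        locus = (read[start - FLANK:start], motif, read[end:end + FLANK])
        yield locus, (end - start) // len(motif)


def rank_loci(reads_by_genotype):
    """Collect the distinct repeat lengths per locus over all genotypes and
    return loci sorted by number of length variants, most variable first."""
    variants = defaultdict(set)
    for reads in reads_by_genotype.values():
        for read in reads:
            for locus, units in find_repeats(read):
                variants[locus].add(units)
    return sorted(variants.items(), key=lambda item: len(item[1]), reverse=True)


# Toy data (hypothetical reads, not from the study): one variable AC repeat
# and one CTT repeat that has the same length in every genotype.
example_reads = {
    "rose_1": ["GATTACAG" + "AC" * 5 + "TTGCCATG",
               "CCGTTAGC" + "CTT" * 5 + "GGATCCAA"],
    "rose_2": ["GATTACAG" + "AC" * 7 + "TTGCCATG",
               "CCGTTAGC" + "CTT" * 5 + "GGATCCAA"],
    "rose_3": ["GATTACAG" + "AC" * 9 + "TTGCCATG"],
}

for (left, motif, right), lengths in rank_loci(example_reads):
    print(motif, sorted(lengths))
```

In this toy input the AC repeat is ranked first because it shows three length variants, so under this sketch it would be the candidate carried forward to primer design, while the invariant CTT repeat would be deprioritised.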

Keywords: RNA-seq; microsatellite marker; next-generation sequencing; simple sequence repeat.

Publication types

Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Computational Biology / methods*
Genetic Variation*
Genotyping Techniques / methods*
Microsatellite Repeats*
Molecular Sequence Data
Repetitive Sequences, Nucleic Acid*
Rosa / classification
Rosa / genetics
Sequence Analysis, DNA
Transcriptome*

Associated data

GENBANK/HG934830
GENBANK/HG934831
GENBANK/HG934832
GENBANK/HG934834
GENBANK/HG934835
GENBANK/HG934836
GENBANK/HG934837
GENBANK/HG934838
GENBANK/HG934840
GENBANK/HG934843
GENBANK/HG934844
GENBANK/HG934845
GENBANK/HG934846
GENBANK/HG934848
GENBANK/HG934849
GENBANK/HG934850
GENBANK/HG934851
GENBANK/LK392375