Mining for SNPs and SSRs using SNPServer, dbSNP and SSR taxonomy tree

Methods Mol Biol. 2009;537:303-21. doi: 10.1007/978-1-59745-251-9_15.

Abstract

Molecular genetic markers represent one of the most powerful tools for the analysis of genomes and the association of heritable traits with underlying genetic variation. The development of high-throughput methods for the detection of single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) has led to a revolution in their use as molecular markers. The availability of large sequence data sets permits mining for these molecular markers, which may then be used for applications such as genetic trait mapping, diversity analysis and marker-assisted selection in agriculture. Here we describe web-based automated methods for the discovery of SSRs using SSR taxonomy tree, the discovery of SNPs from sequence data using SNPServer and the identification of validated SNPs from within the dbSNP database. SSR taxonomy tree identifies pre-determined SSR amplification primers for virtually all species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequences. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms. The NCBI dbSNP database is a catalogue of molecular variation, hosting validated SNPs for several species within a public-domain archive.
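
The redundancy-based idea mentioned above can be illustrated with a minimal Python sketch. It is not the autoSNP implementation: it assumes the sequences have already been clustered and aligned to equal length (the step CAP3 performs in SNPServer), and it reports a column as a candidate SNP only when at least two different bases are each seen in a minimum number of sequences. The function name `call_candidate_snps` and the threshold `min_allele_count` are hypothetical names chosen for this illustration, and gap characters are simply ignored, so indels are not detected here.

```python
from collections import Counter


def call_candidate_snps(aligned_seqs, min_allele_count=2):
    """Return candidate SNP columns from equal-length, pre-aligned sequences.

    A column is reported when two or more distinct bases are each supported
    by at least `min_allele_count` sequences. Requiring this redundancy helps
    to separate true polymorphisms from isolated sequencing errors.
    (Illustrative sketch only; not the autoSNP algorithm.)
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    candidates = []
    for pos in range(length):
        # Count only unambiguous bases; gaps ('-') and N are skipped.
        counts = Counter(
            seq[pos].upper() for seq in aligned_seqs if seq[pos].upper() in "ACGT"
        )
        supported = {base: n for base, n in counts.items() if n >= min_allele_count}
        if len(supported) >= 2:
            candidates.append((pos, dict(sorted(supported.items()))))
    return candidates


if __name__ == "__main__":
    # Hypothetical toy alignment: column 2 has G in three reads and A in two;
    # column 6 has a single T, which is treated as a likely error.
    reads = ["ACGTACGT", "ACGTACGT", "ACATACGT", "ACATACGT", "ACGTACTT"]
    for position, alleles in call_candidate_snps(reads):
        print(position, alleles)
```

In this toy alignment only the 0-based column 2 is reported, because the lone T in column 6 lacks redundant support. Raising `min_allele_count` makes the call more conservative at the cost of missing rare alleles.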

MeSH terms

  • Base Sequence
  • Classification
  • Computational Biology
  • Databases, Genetic
  • Genetic Markers
  • Internet
  • Microsatellite Repeats*
  • Molecular Sequence Data
  • Polymorphism, Single Nucleotide*
  • Sequence Analysis, DNA / methods
  • Software*
  • User-Computer Interface

Substances

  • Genetic Markers