A Simple Algorithm for Population Classification

Sci Rep. 2016 Mar 31:6:23491. doi: 10.1038/srep23491.

Abstract

A single-nucleotide polymorphism (SNP) is a variation in the DNA sequence that occurs when a single nucleotide in the genome differs across members of the same species. Variations in human DNA sequences are associated with human diseases, which makes SNPs a key to opening the door to personalized medicine. SNPs can also be used for human identification and forensic applications. Compared to short tandem repeat (STR) loci, SNPs have much lower statistical power for individual recognition because there are only 3 possible genotypes for each SNP marker, but they may provide sufficient information to identify the population to which a given sample belongs. In this report, using eight SNP markers for 641 samples, we performed a standard statistical classification procedure and found that 86% of the samples could be classified accurately under a two-population model. This study suggests the potential use of SNPs in population classification with a small number (n ≤ 8) of genetic markers for forensic screening, biodiversity and disaster victim controlling.
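The abstract does not say which classification procedure was used, so the following is only an illustrative sketch of how a two-population assignment from eight SNP genotypes might work. It assumes a likelihood-ratio (naive Bayes) rule with Hardy-Weinberg genotype probabilities and independent markers. The allele frequencies, the sample, and all function names are hypothetical and are not taken from the paper.

```python
import math

def genotype_prob(g, p):
    """P(genotype | allele frequency p), assuming Hardy-Weinberg equilibrium.

    g is the number of copies of the reference allele (0, 1 or 2), which is
    why each SNP marker has only 3 possible genotypes.
    """
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)[g]

def log_likelihood(genotypes, freqs):
    """Log-likelihood of a multi-marker genotype, assuming independent markers."""
    return sum(math.log(genotype_prob(g, p)) for g, p in zip(genotypes, freqs))

def classify(genotypes, freqs_a, freqs_b):
    """Assign a sample to population "A" or "B" by the larger likelihood.

    Returns the label and the log-likelihood ratio (positive favours A).
    """
    ll_a = log_likelihood(genotypes, freqs_a)
    ll_b = log_likelihood(genotypes, freqs_b)
    return ("A" if ll_a >= ll_b else "B"), ll_a - ll_b

# Hypothetical reference-allele frequencies for eight markers (not from the paper).
FREQS_A = [0.9, 0.8, 0.7, 0.85, 0.6, 0.75, 0.9, 0.8]
FREQS_B = [0.2, 0.3, 0.4, 0.25, 0.5, 0.35, 0.1, 0.3]

# Hypothetical sample: reference-allele counts at the eight markers.
sample = [2, 2, 1, 2, 1, 2, 2, 1]
label, llr = classify(sample, FREQS_A, FREQS_B)

# With 3 genotypes per marker, eight markers give 3**8 distinct profiles.
n_profiles = 3 ** 8

print(label, round(llr, 2), n_profiles)
```

With only 3**8 = 6561 possible eight-marker profiles, many individuals will share a profile, which is why such a panel is weak for identifying individuals yet can still separate populations whose allele frequencies differ.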

MeSH terms

  • Algorithms*
  • Biodiversity
  • Forensic Sciences / methods*
  • Gene Frequency
  • Genetic Markers
  • Genetics, Population / methods*
  • Genotype
  • Genotyping Techniques
  • Humans
  • Microsatellite Repeats
  • Models, Statistical*
  • Polymorphism, Single Nucleotide*

Substances

  • Genetic Markers