Ancestry informative markers (AIMs) for Korean and other East Asian and South East Asian populations

Int J Legal Med. 2019 Nov;133(6):1711-1719. doi: 10.1007/s00414-019-02129-7. Epub 2019 Aug 7.

Abstract

Inference of ancestry from biological evidence can provide investigative information, especially for unknown DNA donors. Although tools for predicting ancestry continue to be developed, ancestry research focusing on populations relevant to South Korea remains uncommon, and markers are seldom chosen specifically to differentiate Koreans from other East Asian and South East Asian populations. Here, we report ancestry informative markers (AIMs) for distinguishing six East/South East Asian regional populations: China, Japan, Indonesia, Philippines, South Korea and Thailand. Individual genotypes from these six populations were available in PanSNPdb: The HUGO Pan-Asian SNP Database. To select AIMs, we calculated four population divergence metrics for each SNP: Nei's FST, Rosenberg's Informativeness (In), the average absolute allele frequency difference between populations (δFmean) and the maximum allele frequency difference between populations (δFmax). Based on these values, we selected 100 single nucleotide polymorphisms (SNPs) for distinguishing the six populations, 13 of which exhibited large allele frequency differences between Koreans and non-Koreans. To assess the performance of the AIMs, we performed principal coordinates analysis (PCoA) on the individuals from all six populations and inferred ancestral population clusters using the STRUCTURE program. In conclusion, we found that the selected AIMs can be applied to distinguish the six East/South East Asian groups, and we suggest that the markers in this study will be helpful in establishing ancestry panels for Korea and neighbouring populations.
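
The four divergence metrics named in the abstract can be illustrated for a single biallelic SNP with a minimal sketch. This is not the authors' implementation: the function name `divergence_metrics` is hypothetical, populations are assumed to be equally weighted, no sample-size correction is applied to FST, and δFmean and δFmax are assumed to be taken over all pairs of populations. The input is one allele frequency per population.

```python
from itertools import combinations
from math import log


def _xlogx(x):
    """Return x * ln(x), using the convention 0 * ln(0) = 0."""
    return x * log(x) if x > 0 else 0.0


def divergence_metrics(freqs):
    """Compute divergence metrics for one biallelic SNP.

    freqs: frequency of one allele in each population (values in [0, 1]).
    Assumptions (not taken from the paper): equal population weights,
    no sample-size correction, pairwise deltas over all population pairs.
    """
    k = len(freqs)
    if k < 2:
        raise ValueError("need at least two populations")

    # Absolute allele frequency difference for every pair of populations.
    deltas = [abs(a - b) for a, b in combinations(freqs, 2)]

    # Nei's FST as (H_T - H_S) / H_T from expected heterozygosities.
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    fst = (h_t - h_s) / h_t if h_t > 0 else 0.0

    # Rosenberg's informativeness for assignment (In), summed over both alleles.
    i_n = 0.0
    for allele in (freqs, [1 - p for p in freqs]):
        mean = sum(allele) / k
        i_n += -_xlogx(mean) + sum(_xlogx(p) for p in allele) / k

    return {
        "fst": fst,
        "in": i_n,
        "delta_mean": sum(deltas) / len(deltas),
        "delta_max": max(deltas),
    }


if __name__ == "__main__":
    # Made-up frequencies for six populations, for illustration only.
    print(divergence_metrics([0.10, 0.15, 0.60, 0.55, 0.12, 0.70]))
```

Under these assumptions, a SNP fixed for different alleles in two populations gives FST = 1 and In = ln 2, while a SNP with identical frequencies everywhere gives values of zero for all four metrics. Ranking SNPs by such scores is one plausible way to shortlist candidate AIMs.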

Keywords: Ancestry informative markers (AIMs); Korea; Principal coordinates analysis (PCoA); STRUCTURE; Single nucleotide polymorphisms (SNPs).

MeSH terms

  • Asia
  • Asian People / genetics*
  • DNA Fingerprinting
  • Databases, Genetic
  • Gene Frequency
  • Genetic Markers*
  • Genetics, Population*
  • Genotype
  • Humans
  • Polymorphism, Single Nucleotide*
  • Principal Component Analysis

Substances

  • Genetic Markers