Systematic selection of ancestry informative SNPs for differentiating Han, Japanese, Dai, and Kinh populations

Electrophoresis. 2023 Sep;44(17-18):1405-1413. doi: 10.1002/elps.202200292. Epub 2023 Jun 16.

Abstract

Biogeographical origin inference can provide valuable clues in forensic investigations by narrowing the scope of the investigation. However, most research on forensic ancestry analysis focuses on major continental populations, which may provide limited information in forensic practice. To improve ancestral resolution within East Asian populations, we systematically selected ancestry informative single-nucleotide polymorphisms (AISNPs) for differentiating Han, Dai, Japanese, and Kinh populations. In addition, we evaluated the performance of the selected AISNPs in differentiating these populations using multiple methods. In total, 116 AISNPs were selected from genome-wide data to infer the population origins of these four populations. Principal component analysis and population genetic structure analysis indicated that the selected 116 AISNPs could achieve ancestral resolution for most individuals. Furthermore, a machine learning model built on the 116 AISNPs assigned most individuals from these four populations to their correct population origins. In summary, the selected 116 AISNPs can be used to predict the ancestral origins of Han, Dai, Japanese, and Kinh populations, and may provide valuable information for forensic research and genome-wide association studies in East Asian populations.
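
The abstract does not state which statistic was used to select the AISNPs. As a rough illustration only, the sketch below ranks SNPs by the largest pairwise difference in alternate-allele frequency between populations, a common informativeness measure that is assumed here and is not confirmed as the authors' method. The function names, the SNP identifiers, and the toy genotypes (coded as 0, 1, or 2 copies of the alternate allele) are all hypothetical.

```python
from itertools import combinations


def allele_frequency(genotypes):
    """Alternate-allele frequency from diploid genotypes coded 0, 1, or 2."""
    return sum(genotypes) / (2 * len(genotypes))


def max_pairwise_delta(freq_by_pop):
    """Largest absolute allele-frequency difference between any two populations."""
    return max(abs(a - b) for a, b in combinations(freq_by_pop.values(), 2))


def rank_snps(genotype_table, top_k):
    """Score every SNP and return the top_k SNP IDs plus all scores.

    genotype_table maps SNP ID -> {population label -> list of genotypes}.
    Ties are broken alphabetically by SNP ID so the result is deterministic.
    """
    scores = {}
    for snp_id, by_pop in genotype_table.items():
        freqs = {pop: allele_frequency(g) for pop, g in by_pop.items()}
        scores[snp_id] = max_pairwise_delta(freqs)
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_k], scores


# Hypothetical toy data: three SNPs, four individuals per population.
toy = {
    "snp_A": {"Han": [0, 0, 1, 0], "Japanese": [2, 2, 1, 2],
              "Dai": [0, 1, 0, 0], "Kinh": [0, 0, 0, 1]},
    "snp_B": {"Han": [1, 1, 1, 1], "Japanese": [1, 1, 1, 1],
              "Dai": [1, 1, 1, 1], "Kinh": [1, 1, 1, 1]},
    "snp_C": {"Han": [0, 0, 0, 0], "Japanese": [0, 1, 0, 0],
              "Dai": [1, 2, 1, 1], "Kinh": [1, 1, 0, 1]},
}

selected, scores = rank_snps(toy, top_k=2)
print(selected)
```

In this toy table, snp_B has identical allele frequencies in all four populations and therefore carries no ancestry information, so it is the one excluded. A real panel would be chosen from genome-wide data with far larger samples, and the selected markers would then be evaluated with methods such as those named in the abstract.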

Keywords: East Asian; SNPs; ancestry informative markers; forensic ancestral origins; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • East Asian People*
  • Gene Frequency
  • Genetics, Population
  • Genome-Wide Association Study
  • Genotype
  • Humans
  • Polymorphism, Single Nucleotide* / genetics
  • Racial Groups / genetics