Population genetic analysis of Shaanxi male Han Chinese population reveals genetic differentiation and homogenization of East Asians

Mol Genet Genomic Med. 2020 May;8(5):e1209. doi: 10.1002/mgg3.1209. Epub 2020 Mar 12.

Abstract

Background: Shaanxi province, located in the upper Yellow River region, has been identified as the geographic origin of Chinese civilization, the Sino-Tibetan language family, and foxtail or broomcorn millet farmers on the basis of linguistic phylogenetic, archaeological, and genetic evidence. Today, Han Chinese are the dominant population in this area. Reconstruction of the formation of the modern Shaanxi Han population from ancient DNA is under way; however, the genetic relationships of modern Shaanxi Han, the allele frequency distributions of highly mutable short tandem repeats (STRs), and the corresponding forensic parameters remain to be explored.

Methods: We genotyped 23 autosomal STRs in 630 unrelated Shaanxi male Han individuals using the recently updated Huaxia Platinum PCR amplification system. Allele frequencies and forensic parameters were assessed for all autosomal STRs, and population genetic structure was explored using a range of standard statistical methods.
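As an illustration of the allele frequency assessment mentioned in the Methods, the following is a minimal sketch of how per-locus allele frequencies and heterozygosity could be estimated by direct counting. The function names and the toy genotypes are hypothetical and are not taken from the study; the abstract does not state which software was actually used.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one locus by gene counting.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    counts = Counter()
    for a, b in genotypes:
        counts[a] += 1
        counts[b] += 1
    total = 2 * len(genotypes)  # two alleles per diploid individual
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """Heterozygosity expected under Hardy-Weinberg proportions."""
    return 1.0 - sum(p * p for p in freqs.values())


# Toy data (hypothetical): four individuals typed at a single STR locus.
toy = [(12, 13), (12, 12), (13, 14), (12, 14)]
freqs = allele_frequencies(toy)
print(freqs)
print(observed_heterozygosity(toy), expected_heterozygosity(freqs))
```

Comparing observed with expected heterozygosity is only an informal check; a formal HWE test would typically use an exact test with multiple-testing correction.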

Results: Population genetic analyses based on a raw-genotype dataset of 15,803 Eurasian individuals and on frequency datasets from 56 populations indicated that linguistic stratification is significantly associated with the genetic substructure of East Asian populations. Principal component analysis, multidimensional scaling plots, and a phylogenetic tree further showed that Shaanxi Han are genetically close to the geographically neighboring Shanxi Han, and that Han Chinese have remained a homogeneous population through historic and recent admixture, as judged from STR variation. Apart from Sinitic-speaking populations, Shaanxi Han shared more alleles with Tibeto-Burman-speaking populations than with other reference populations. Regarding allele frequency correlation and forensic parameters, all loci met the minimum requirements of Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) testing. The observed combined probability of discrimination of 8.2201E-28 and the cumulative power of exclusion of 0.9999999995 in Shaanxi Han demonstrated that the studied STR loci are informative and polymorphic, and that this system can serve as a powerful routine forensic tool in personal identification and parentage testing.
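The combined forensic parameters reported above are typically obtained by multiplying per-locus quantities across independent loci. The sketch below shows one common convention, and it is an assumption that the study followed it: matching probability as the sum of squared genotype frequencies, power of discrimination as its complement, and power of exclusion as h^2(1 - 2hH^2), where h is the observed heterozygosity and H = 1 - h. Function names and toy genotypes are hypothetical. Under this convention a value near zero is a combined matching probability, and the combined power of discrimination is its complement; whether the abstract's figure follows that convention is not stated here.

```python
from collections import Counter
from math import prod


def matching_probability(genotypes):
    """Sum of squared genotype frequencies at one locus."""
    n = len(genotypes)
    # Sort each pair so that (8, 9) and (9, 8) count as the same genotype.
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    return sum((c / n) ** 2 for c in counts.values())


def power_of_exclusion(genotypes):
    """Per-locus power of exclusion from observed heterozygosity (assumed formula)."""
    n = len(genotypes)
    h = sum(1 for a, b in genotypes if a != b) / n  # heterozygosity
    H = 1.0 - h                                     # homozygosity
    return h * h * (1.0 - 2.0 * h * H * H)


def combined_matching_probability(loci):
    """Product of per-locus matching probabilities, assuming independent loci."""
    return prod(matching_probability(g) for g in loci)


def cumulative_power_of_exclusion(loci):
    """1 minus the product of per-locus non-exclusion probabilities."""
    return 1.0 - prod(1.0 - power_of_exclusion(g) for g in loci)


# Toy data (hypothetical): two loci typed in the same four individuals.
locus1 = [(12, 13), (12, 12), (13, 14), (12, 14)]
locus2 = [(8, 9), (9, 8), (8, 8), (9, 10)]
loci = [locus1, locus2]

cmp_value = combined_matching_probability(loci)
print("combined matching probability:", cmp_value)
print("combined power of discrimination:", 1.0 - cmp_value)
print("cumulative power of exclusion:", cumulative_power_of_exclusion(loci))
```

With 23 highly polymorphic loci rather than two toy loci, the product of matching probabilities becomes extremely small, which is why combined values on the order of 1E-28 are plausible for panels of this size.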

Conclusion: Both geographical and linguistic divisions have shaped the genetic structure of modern East Asians. More forensic reference data should be obtained from ethnically, culturally, geographically, and linguistically diverse populations to support routine forensic practice and population genetic studies.

Keywords: Han Chinese; forensic science; genetic differentiation; population genetics; short tandem repeats.

Publication types

  • Research Support, Non-U.S. Gov't
  • Retracted Publication

MeSH terms

  • China
  • Ethnicity / genetics
  • Evolution, Molecular*
  • Forensic Genetics / methods
  • Genotyping Techniques / methods
  • Humans
  • Male
  • Microsatellite Repeats
  • Pedigree
  • Polymorphism, Genetic*
  • Population / genetics*