Comprehensive evaluations of individual discrimination, kinship analysis, genetic relationship exploration and biogeographic origin prediction in Chinese Dongxiang group by a 60-plex DIP panel

Hereditas. 2023 Mar 29;160(1):14. doi: 10.1186/s41065-023-00271-2.

Abstract

Background: Dongxiang group, as an important minority, resides in Gansu province which is located at the northwest China, forensic detection system with more loci needed to be studied to improve the application efficiency of forensic case investigation in this group.

Methods: A 60-plex system including 57 autosomal deletion/insertion polymorphisms (A-DIPs), 2 Y chromosome DIPs (Y-DIPs) and the sex determination locus (Amelogenin) was explored to evaluate the forensic application efficiencies of individual discrimination, kinship analysis and biogeographic origin prediction in Gansu Dongxiang group based on the 60-plex genotype results of 233 unrelated Dongxiang individuals. The 60-plex genotype results of 4582 unrelated individuals from 33 reference populations in five different continents were also collected to analyze the genetic background of Dongxiang group and its genetic relationships with other continental populations.

Results: The system showed high individual discrimination power, as the cumulative power of discrimination (CPD), cumulative power of exclusion (CPE) for trio and cumulative match probability (CMP) values were 0.99999999999999999999997297, 0.999980 and 2.7029E- 24, respectively. The system could distinguish 98.12%, 93.78%, 82.18%, 62.35% and 39.32% of full sibling pairs from unrelated individual pairs, when the likelihood ratio (LR) limits were set as 1, 10, 100, 1000 and 10,000 based on the simulated family samples, respectively. Additionally, Dongxiang group had the close genetic distances with populations in East Asia, especially showed the intimate genetic relationships with Chinese Han populations, which were concluded from the genetic affinities and genetic background analyses of Dongxiang group and 33 reference populations. In terms of the effectiveness of biogeographic origin inference, different artificial intelligent algorithms possessed different efficacies. Among them, the random forest (RF) and extreme gradient boosting (XGBoost) algorithm models could accurately predict the biogeographic origins of 99.7% and 90.59% of three and five continental individuals, respectively.

Conclusion: This 60-plex system had good performance for individual discrimination, kinship analysis and biogeographic origin prediction in Dongxiang group, which could be used as a powerful tool for case investigation.

Keywords: Artificial intelligence algorithm; Biogeographic origin prediction; Deletion/insertion polymorphism; Individual discrimination; Kinship analysis.

MeSH terms

  • China
  • East Asian People* / genetics
  • Gene Frequency
  • Genetics, Population*
  • Humans
  • Microsatellite Repeats
  • Minority Groups
  • Polymorphism, Genetic