Inferring Chinese surnames with Y-STR profiles

Forensic Sci Int Genet. 2018 Mar:33:66-71. doi: 10.1016/j.fsigen.2017.11.014. Epub 2017 Nov 24.

Abstract

Co-ancestry of human surnames and Y-chromosomes in most human populations and social groups suggests the possibility of inferring one from the other. However, this intuitive idea has yet to be formally explored. In the present study, we develop two computational methods, based on cosine distance (dcos) and coalescence distance (dcoal) respectively, to infer surnames from Y-STR profiles. We also survey Y-STR variation at 15 loci in 19,009 individuals from Shandong Province, China. For the 266 surnames included in the data set, our methods can identify a single surname with an average accuracy of 65%, and achieve an average accuracy higher than 80% when more than four candidate surnames are provided. We also demonstrate that increasing the sample size of surnames and the number of STR loci improves the accuracy of surname inference. Our results indicate that the 15 non-duplicated Y-STR loci contain information from which surnames can be reliably inferred for Chinese populations, which suggests a promising application in forensics.
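
To make the cosine-distance idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each Y-STR profile is encoded as a vector of repeat counts, one entry per locus, and that a query is assigned to the surnames whose nearest reference profile has the smallest cosine distance. The function names, the nearest-profile rule, and the four-locus toy profiles are all illustrative assumptions; the abstract does not specify how dcos is applied, and the coalescence distance (dcoal) is not sketched here because its definition is not given.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def rank_surnames(query, reference, k=1):
    """Return the k candidate surnames closest to the query profile.

    reference is a list of (surname, profile) pairs. Each surname is scored
    by its closest reference profile (an assumed rule, chosen for simplicity).
    """
    best = {}
    for surname, profile in reference:
        d = cosine_distance(query, profile)
        if surname not in best or d < best[surname]:
            best[surname] = d
    return sorted(best, key=best.get)[:k]

# Made-up repeat counts at four hypothetical loci (not data from the study).
reference = [
    ("Wang", [14, 12, 29, 23]),
    ("Li", [15, 13, 30, 24]),
    ("Zhang", [13, 14, 28, 21]),
]
query = [14, 12, 29, 23]

top1 = rank_surnames(query, reference, k=1)
top2 = rank_surnames(query, reference, k=2)
```

Returning k > 1 candidates mirrors the trade-off reported above: a longer candidate list is more likely to contain the true surname than a single best guess. Note that raw repeat counts yield very small cosine distances, so a real application might rescale or centre the profiles first; that choice is also an assumption.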

Keywords: Coalescence; Cosine distance; STR; Surname; Y-chromosome.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asian People / genetics*
  • China
  • Chromosomes, Human, Y*
  • DNA Fingerprinting
  • Genetics, Population*
  • Genotype
  • Humans
  • Male
  • Microsatellite Repeats*
  • Models, Genetic
  • Names*