[Russian gene pool. Gene geography of surnames]

Genetika. 2001 Jul;37(7):974-90.
[Article in Russian]

Abstract

Surnames are traditionally used in population genetics as "quasi-genetic" markers (i.e., analogs of genes) for studying the structure of a gene pool and the factors of its microevolution. In this study, the spatial variation of Russian surnames was analyzed using computer-based gene geography, and the gene geography of surnames was shown to be promising for population studies of the total Russian gene pool. Surname frequencies were studied in 64 sel'sovets (rural communities; 33,000 persons in total) of 52 raions (districts) in 22 oblasts (regions) of the European part of Russia. For each of 75 widespread surnames, an electronic map of its frequency was constructed. Summary maps of principal components were then drawn from all the maps of individual surnames. The first 5 of the 75 principal components accounted for half of the total variance, which indicates the high resolving power of surnames. The map of the first principal component exhibits a trend directed from the northwestern to the eastern regions of the area studied. The trend of the second component was directed from the southwestern to the northern regions of the area studied, i.e., it was close to latitudinal. This trend almost coincided with the latitudinal trend of the principal components for three other sets of data (genetic, anthropological, and dermatoglyphic). The latitudinal trend may therefore be considered the main direction of variation of the Russian gene pool. The similarity between the main patterns of variation for the genetic and quasi-genetic markers demonstrates that surnames are effective for analyzing the Russian gene pool. In view of the dispute between R. Sokal and L.L. Cavalli-Sforza about the effects of false correlations, the maps of principal components of Russian surnames were constructed by two methods: through analysis of the maps and through direct analysis of the original data on surname frequencies. The almost complete coincidence of these maps (correlation coefficient rho = 0.96) indicates that, within the reliability of the data, the resulting maps of principal components are free of false-correlation errors.
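
The two numerical summaries quoted above (the share of variance carried by the leading principal components, and the correlation between two maps of the same component) can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the symbol rho is taken here to mean Spearman's rank correlation, each map is represented as a flat list of values at matching grid nodes, and the function names and sample numbers are hypothetical, invented only for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(map_a, map_b):
    """Spearman's rho: Pearson correlation of the rank-transformed maps."""
    return pearson(ranks(map_a), ranks(map_b))


def variance_share(eigenvalues, k):
    """Fraction of total variance carried by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical values of one principal component at five grid nodes,
# obtained by two different routes (from maps vs. from raw frequencies).
pc_from_maps = [0.1, 0.4, 0.2, 0.9, 0.7]
pc_from_data = [0.2, 0.5, 0.1, 0.8, 0.9]
rho = spearman_rho(pc_from_maps, pc_from_data)

# Hypothetical eigenvalues of a small covariance matrix.
share = variance_share([4.0, 3.0, 2.0, 1.0], 2)
```

In the setting of the abstract, `variance_share` would be applied to the 75 eigenvalues with `k = 5`, and `spearman_rho` to the grid values of the same component computed by the two methods; the actual software and statistics used by the authors are not specified in this record.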

Publication types

  • English Abstract

MeSH terms

  • Gene Pool*
  • Genetic Markers
  • Genetics, Population
  • Humans
  • Names*
  • Russia

Substances

  • Genetic Markers