Large-scale lexical and genetic alignment supports a hybrid model of Han Chinese demic and cultural diffusions

Nat Hum Behav. 2024 May 13. doi: 10.1038/s41562-024-01886-9. Online ahead of print.

Abstract

Han Chinese history has been shaped by substantial demographic activities and sociocultural transmissions. However, it remains challenging to assess the contributions of demic and cultural diffusion to Han culture and language, primarily because genetic-linguistic congruence has not been rigorously examined. Here we digitized a large-scale linguistic inventory comprising 1,018 lexical traits across 926 dialect varieties. Using phylogenetic analysis and admixture inference, we revealed a north-south gradient of lexical differences that probably resulted from historical migrations. Furthermore, we quantified extensive horizontal language transfers and pinpointed central China as a dialectal melting pot. Integrating genetic data from 30,408 Han Chinese individuals, we compared the lexical and genetic landscapes across 26 provinces. Our results support a hybrid model in which demic diffusion predominantly impacts central China, while cultural diffusion and language assimilation occur in southwestern and coastal regions, respectively. This interdisciplinary study sheds light on the complex social-genetic history of the Han Chinese.