Contact-tracing in cultural evolution: a Bayesian mixture model to detect geographic areas of language contact

J R Soc Interface. 2021 Aug;18(181):20201031. doi: 10.1098/rsif.2020.1031. Epub 2021 Aug 11.

Abstract

When speakers of different languages interact, they are likely to influence each other: contact leaves traces in the linguistic record, which in turn can reveal geographical areas of past human interaction and migration. However, other factors may contribute to similarities between languages. Inheritance from a shared ancestral language and universal preference for a linguistic property may both overshadow contact signals. How can we find geographical contact areas in language data, while accounting for the confounding effects of inheritance and universal preference? We present sBayes, an algorithm for Bayesian clustering in the presence of confounding effects. The algorithm learns which similarities are better explained by confounders, and which are due to contact effects. Contact areas are free to take any shape or size, but an explicit geographical prior ensures their spatial coherence. We test sBayes on simulated data and apply it in two case studies to reveal language contact in South America and the Balkans. Our results are supported by findings from previous studies. While we focus on detecting language contact, the method can also be used to uncover other traces of shared history in cultural evolution, and more generally, to reveal latent spatial clusters in the presence of confounders.
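
The idea of weighing contact against its confounders can be illustrated with a minimal sketch of a three-component mixture likelihood for a single categorical feature. It is not the sBayes implementation: the function name, the weight keys, the renormalisation of weights when a component is absent, and all numbers are assumptions made for illustration, and the geographical prior is left out.

```python
# Hypothetical sketch: mixture likelihood of one observed feature state.
# All names and numbers are illustrative assumptions, not taken from sBayes.

def feature_likelihood(state, weights, p_universal, p_family=None, p_area=None):
    """Probability of `state` as a weighted mix of the available effects.

    weights     -- dict with keys "universal", "inheritance", "contact"
    p_universal -- state probabilities under universal preference
    p_family    -- state probabilities in the language's family (None if unknown)
    p_area      -- state probabilities in a contact area (None if not in one)
    """
    components = [(weights["universal"], p_universal)]
    if p_family is not None:
        components.append((weights["inheritance"], p_family))
    if p_area is not None:
        components.append((weights["contact"], p_area))
    # Assumption: weights are renormalised over the components that apply.
    total = sum(w for w, _ in components)
    return sum(w / total * p[state] for w, p in components)


weights = {"universal": 0.5, "inheritance": 0.3, "contact": 0.2}
p_universal = [0.7, 0.3]  # binary feature: state 1 is rare worldwide
p_family = [0.6, 0.4]     # slightly more common in the family
p_area = [0.1, 0.9]       # very common inside the candidate contact area

inside = feature_likelihood(1, weights, p_universal, p_family, p_area)
outside = feature_likelihood(1, weights, p_universal, p_family)
print(inside, outside)
```

With these made-up values, assigning the language to the area raises the probability of observing state 1 from about 0.34 to 0.45. A sampler exploring area assignments would favour including languages whose features are poorly explained by universal preference and inheritance alone.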

Keywords: Bayesian clustering; confounding; cultural evolution; language; linguistic areas; mixture model; spatial analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Contact Tracing
  • Cultural Evolution*
  • Humans
  • Language*
  • Linguistics