Bayesian regression versus machine learning for rapid age estimation of archaeological features identified with lidar at Angkor

Sci Rep. 2023 Oct 20;13(1):17913. doi: 10.1038/s41598-023-44875-0.

Abstract

Lidar (light detection and ranging) has revolutionized archaeology. We are now able to produce high-resolution maps of archaeological surface features over vast areas, allowing us to see ancient land use and anthropogenic landscape modification at previously unimagined scales. In the tropics, this has enabled the documentation of previously unrecorded cities, igniting scientific and popular interest in ancient tropical urbanism. An emerging challenge, however, is to add temporal depth to this torrent of new spatial data, because traditional archaeological investigations are time-consuming and inherently destructive. So far, we are aware of only one attempt to apply statistics and machine learning to remotely sensed data in order to add time-depth to spatial data. Using temples at the well-known massive urban complex of Angkor in Cambodia as a case study, a predictive model was developed that combined standard regression with novel machine learning methods to estimate foundation dates for undated Angkorian temples identified with remote sensing, including lidar. The model's predictions were used to produce an historical population curve for Angkor and to study urban expansion at this important ancient tropical urban centre. The approach, however, has certain limitations. Importantly, its handling of uncertainties leaves room for improvement and, like many machine learning approaches, it is opaque regarding which predictor variables are most relevant. Here we describe a new study in which we investigated an alternative Bayesian regression approach applied to the same case study. We compare the two models in terms of their inner workings, results, and interpretive utility. We also use an updated database of Angkorian temples as the training dataset, allowing us to produce the most current estimate of temple foundation dates and historic spatiotemporal urban growth patterns at Angkor.
Our results demonstrate that, in principle, predictive statistical and machine learning methods could be used to rapidly add chronological information to large lidar datasets, and that a Bayesian paradigm makes it possible to incorporate important uncertainties, especially chronological ones, into modelled temporal estimates.