Curriculum learning-based strategy for low-density archaeological mound detection from historical maps in India and Pakistan

Iban Berganzo-Besga; Hector A Orengo; Felipe Lumbreras; Aftab Alam; Rosie Campbell; Petrus J Gerrits; Jonas Gregorio de Souza; Afifa Khan; María Suárez-Moreno; Jack Tomaney; Rebecca C Roberts; Cameron A Petrie

doi:10.1038/s41598-023-38190-x

Sci Rep. 2023 Jul 12;13(1):11257. doi: 10.1038/s41598-023-38190-x.

Authors

Iban Berganzo-Besga¹, Hector A Orengo²,³, Felipe Lumbreras⁴, Aftab Alam⁵, Rosie Campbell⁶, Petrus J Gerrits⁶, Jonas Gregorio de Souza⁷, Afifa Khan⁶, María Suárez-Moreno⁶, Jack Tomaney⁶, Rebecca C Roberts⁶, Cameron A Petrie⁶,⁸

Affiliations

¹ Landscape Archaeology Research Group (GIAP), Catalan Institute of Classical Archaeology (ICAC), Pl. Rovellat s/n, 43003, Tarragona, Spain.
² Landscape Archaeology Research Group (GIAP), Catalan Institute of Classical Archaeology (ICAC), Pl. Rovellat s/n, 43003, Tarragona, Spain. horengo@icac.cat.
³ Catalan Institution for Research and Advanced Studies (ICREA), Passeig Lluís Companys 23, 08010, Barcelona, Spain. horengo@icac.cat.
⁴ Computer Science Department, Computer Vision Center, Universitat Autònoma de Barcelona, Edifici O, Campus UAB, 08193, Bellaterra, Spain.
⁵ Banaras Hindu University, Ajagara, Varanasi, Uttar Pradesh, 221005, India.
⁶ McDonald Institute for Archaeological Research, University of Cambridge, Downing St., Cambridge, CB2 3ER, UK.
⁷ Complexity and Socio-Ecological Dynamics (CaSEs) Research Group, Universitat Pompeu Fabra, Barcelona, Spain.
⁸ Department of Archaeology, University of Cambridge, Downing St., Cambridge, CB2 3DZ, UK.

Abstract

This paper presents two algorithms for the large-scale automatic detection and instance segmentation of potential archaeological mounds on historical maps. Historical maps present a unique source of information for the reconstruction of ancient landscapes. The last 100 years have seen unprecedented landscape modifications with the introduction and large-scale implementation of mechanised agriculture, channel-based irrigation schemes, and urban expansion, to name but a few. Historical maps offer a window onto disappearing landscapes, depicting many historical and archaeological elements that no longer exist today. The algorithms focus on the detection and shape extraction of mound features with a high probability of being archaeological settlements. Mounds are among the most commonly documented archaeological features in the Survey of India historical map series, although they were not necessarily recognised as such at the time of surveying. Mound features with high archaeological potential are most commonly depicted through hachures or contour-equivalent form-lines; therefore, one algorithm has been designed to detect each of these two depiction types. Our proposed approach addresses two of the most common issues in automated archaeological survey: the low density of archaeological features to be detected and the small amount of training data available. It has been applied to all available map types of the historic 1″ to 1-mile series, which increases the complexity of the detection. Moreover, the inclusion of synthetic data, along with a Curriculum Learning strategy, has allowed the algorithms to better learn what the mound features look like. Likewise, a series of filters based on topographic setting, form, and size has been applied to improve the accuracy of the models.
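
The Curriculum Learning strategy mentioned above is not specified in this abstract. As a purely illustrative sketch of the general idea (training on easier examples first and adding harder ones in stages), the following assumes a hypothetical per-sample difficulty score. The function name, the tile names, and the scores are all assumptions for illustration, not the authors' implementation.

```python
import math
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    samples: Sequence[T],
    difficulty: Callable[[T], float],
    n_stages: int,
) -> Iterator[List[T]]:
    """Yield cumulative training pools, easiest samples first.

    Generic curriculum schedule: stage k contains the easiest
    ceil(k / n_stages) fraction of the samples, so the last stage
    contains every sample.
    """
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    ordered = sorted(samples, key=difficulty)
    for stage in range(1, n_stages + 1):
        cutoff = math.ceil(len(ordered) * stage / n_stages)
        yield ordered[:cutoff]


# Hypothetical training tiles with made-up difficulty scores
# (lower = easier, e.g. clean synthetic mounds before faded real ones).
tiles = [
    ("real_faded", 0.9),
    ("synthetic_a", 0.1),
    ("real_clear", 0.4),
    ("synthetic_b", 0.2),
]

stages = list(curriculum_stages(tiles, difficulty=lambda t: t[1], n_stages=2))
stage_names = [[name for name, _ in stage] for stage in stages]
```

In this sketch the first stage holds only the two easiest tiles and the second stage holds all four; a model would be trained or fine-tuned on each pool in turn.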
The resulting algorithms achieve a recall of 52.61% and a precision of 82.31% for the hachure mounds, and a recall of 70.80% and a precision of 70.29% for the form-line mounds. This allowed the detection of nearly 6000 mound features over an area of 470,500 km², the largest application of such an approach to date. If we restrict our focus to the maps most similar to those used in the algorithm training, we reach recall values greater than 60% and precision values greater than 90%. This approach has shown the potential to implement an adaptive algorithm that, after a small amount of retraining with data detected from a new map, improves general mound feature detection on that same map.
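
To compare the two detectors with a single number, the precision and recall figures reported above can be combined into an F1 score (their harmonic mean). The abstract does not report F1; the values computed below are derived here from the stated percentages and are only an illustrative calculation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs taken from the abstract, expressed as fractions.
reported = {
    "hachure": (0.8231, 0.5261),
    "form-line": (0.7029, 0.7080),
}

f1_by_type = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_type.items():
    print(f"{name}: F1 = {f1:.2%}")
# hachure: F1 = 64.19%
# form-line: F1 = 70.54%
```

The form-line detector's balanced precision and recall give it the higher derived F1, while the hachure detector trades recall for noticeably higher precision.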