Measuring sustainable tourism with online platform data

EPJ Data Sci. 2022;11(1):41. doi: 10.1140/epjds/s13688-022-00354-6. Epub 2022 Jul 18.

Abstract

Sustainability in tourism is a topic of global relevance and is mentioned multiple times in the United Nations Sustainable Development Goals. The complex task of balancing tourism's economic, environmental, and social effects requires detailed and up-to-date data. This paper investigates whether online platform data can serve as an alternative data source for sustainable tourism statistics. A sustainability label for accommodations can be predicted reasonably well by applying machine learning techniques to a web-scraped dataset from a large online tourism platform. The algorithmic prediction of accommodations' sustainability from online data can provide a cost-effective and accurate measure for tracking developments in tourism sustainability across the globe at high spatial and temporal granularity.
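Because the keywords flag this as an imbalanced classification problem, it helps to see why plain accuracy says little about how well a rare label is predicted. The following is a minimal sketch on made-up toy labels, not the paper's data, model, or evaluation protocol. The label share (5 of 100), the predictions, and the function names are all illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive (rare) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: 5 of 100 accommodations carry the sustainability label.
y_true = [1] * 5 + [0] * 95

# Baseline that never predicts the label.
always_negative = [0] * 100

# Hypothetical classifier: finds 3 of the 5 labelled accommodations
# and raises 2 false alarms among the unlabelled ones.
toy_model = [1, 1, 1, 0, 0] + [1, 1] + [0] * 93

baseline_acc = accuracy(y_true, always_negative)
baseline_prf = precision_recall_f1(y_true, always_negative)
model_acc = accuracy(y_true, toy_model)
model_prf = precision_recall_f1(y_true, toy_model)

print("baseline accuracy:", baseline_acc, "precision/recall/F1:", baseline_prf)
print("toy model accuracy:", model_acc, "precision/recall/F1:", model_prf)
```

In this toy setting the two accuracies are nearly identical, while precision, recall, and F1 for the labelled class separate the useless baseline from the classifier that actually finds labelled accommodations.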

Supplementary information: The online version contains supplementary material available at 10.1140/epjds/s13688-022-00354-6.

Keywords: Imbalanced classification; Nowcasting; Platform data; Supervised learning; Sustainable tourism; TripAdvisor.