Leveraging insurance customer data to characterize socioeconomic indicators of Swiss municipalities

PLoS One. 2021 Mar 3;16(3):e0246785. doi: 10.1371/journal.pone.0246785. eCollection 2021.

Abstract

The availability of reliable socioeconomic data is critical for the design of urban policies and the implementation of location-based services; however, its temporal and geographical coverage often remains limited. We explore the potential of insurance customer data to predict socioeconomic indicators of Swiss municipalities. First, we define a feature space by aggregating individual customer data at the city level along several behavioral and user-profile dimensions. Second, we collect official statistics shared by the Swiss authorities across a wide spectrum of categories: Population, Transportation, Work, Space and Territory, Housing, and Economy. Third, we adopt two spatial regression models, exploring both global and local geographical dependencies, to investigate how well these indicators can be predicted. Results consistently show a correlation between insurance customer characteristics and official socioeconomic indexes. Performance fluctuates depending on the category, with values of R² > 0.6 for several target variables under 5-fold cross-validation. As a case study, we focus on predicting the percentage of the population using public transportation, and we discuss the implications at the regional scale. We believe that this methodology can support official statistical offices and could open up new opportunities for the characterization of socioeconomic traits at highly granular spatial and temporal scales.
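
The workflow summarized in the abstract (aggregate customer records per municipality, regress an official indicator on the aggregated features, score with 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses synthetic data, a single hypothetical feature, and ordinary least squares in place of the two spatial regression models, so it ignores geographical dependencies entirely. All function and variable names are assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def aggregate_by_municipality(records):
    """Average a per-customer feature value within each municipality."""
    groups = defaultdict(list)
    for municipality, value in records:
        groups[municipality].append(value)
    return {m: mean(values) for m, values in groups.items()}


def fit_ols(xs, ys):
    """Closed-form simple linear regression y = a + b * x (non-spatial stand-in)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_validated_r2(xs, ys, k=5, seed=0):
    """Pooled out-of-fold R^2 from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = {}
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a + b * xs[i]
    my = mean(ys)
    ss_res = sum((ys[i] - preds[i]) ** 2 for i in idx)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data (assumption): 50 municipalities, 20 customers each.
rng = random.Random(42)
records, target = [], {}
for m in range(50):
    base = rng.uniform(0, 10)          # latent municipality-level trait
    for _ in range(20):
        records.append((m, base + rng.gauss(0, 1)))   # noisy customer feature
    target[m] = 2.0 * base + 1.0 + rng.gauss(0, 1)    # mock official indicator

features = aggregate_by_municipality(records)
municipalities = sorted(features)
xs = [features[m] for m in municipalities]
ys = [target[m] for m in municipalities]
r2 = cross_validated_r2(xs, ys, k=5)
print(f"5-fold cross-validated R^2: {r2:.3f}")
```

Scoring on held-out folds, rather than on the data used for fitting, is what makes an R² threshold such as 0.6 meaningful as a measure of predictability. A spatial model would additionally account for the dependence between neighboring municipalities, which this sketch does not.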

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Censuses
  • Cities
  • Databases, Factual
  • Economic Development*
  • Housing
  • Humans
  • Insurance*
  • Motor Vehicles
  • Population Dynamics*
  • Regression Analysis
  • Socioeconomic Factors
  • Switzerland

Grants and funding

This study was supported by La Mobilière Insurance (Characterization of Quality of Space of Urban System, proposal reference 18219_181217) in the form of funding awarded to the HERUS Lab for the salary of a postdoctoral researcher for 2 years, and through the coupling of insurance customer data with statistically available data. The funder had no further role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.