On selection bias in comparison measures of smartphone-generated population mobility: an illustration of no-bias conditions with a commercial data source

Ann Epidemiol. 2022 Jun:70:16-22. doi: 10.1016/j.annepidem.2022.03.003. Epub 2022 Mar 12.

Abstract

Purpose: Passively generated cell-phone location ("mobility") data originally intended for commercial use is now frequently used in epidemiologic research, notably during the COVID-19 pandemic to study the impact of physical-distancing recommendations on aggregate population behavior (e.g., average daily mobility). Because the process by which individuals are selected into these datasets is opaque, researchers have cautioned that their use may give rise to selection bias, yet little guidance exists for assessing this potential threat to validity in mobility-data research. Through an example analysis of cell-phone-derived mobility data, we present a set of conditions to guide the assessment of selection bias in measures comparing aggregate mobility patterns over time and between groups.

Methods: We specifically consider bias in measures comparing group-level mobility within the same group over time (difference, ratio, percent difference) and between groups (difference in differences, ratio of ratios, ratio of percent differences). We illustrate no-bias conditions for these measures through an example comparing block-group-level mobility between income groups in United States metro areas before (January 1-March 10, 2020) and after (March 11-April 19, 2020) the day COVID-19 was declared a pandemic.
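The six comparison measures named above can be sketched in a few lines of Python. This is a minimal illustration using the conventional definitions of each contrast; the abstract does not give formulas, so the function names, the (before, after) tuple layout, and the numeric values are all hypothetical and are not taken from the study.

```python
def within_group_measures(before, after):
    """Contrasts of one group's mean mobility across two periods.

    `before` and `after` are hypothetical group-level mobility averages.
    """
    return {
        "difference": after - before,
        "ratio": after / before,
        "percent_difference": 100.0 * (after - before) / before,
    }


def between_group_measures(group_a, group_b):
    """Contrasts of the within-group measures between two groups.

    Each argument is a (before, after) tuple of mean mobility.
    """
    a = within_group_measures(*group_a)
    b = within_group_measures(*group_b)
    return {
        "difference_in_differences": a["difference"] - b["difference"],
        "ratio_of_ratios": a["ratio"] / b["ratio"],
        "ratio_of_percent_differences": (
            a["percent_difference"] / b["percent_difference"]
        ),
    }


# Made-up values for illustration only (not results from the paper).
higher_income = (10.0, 4.0)  # (before, after)
lower_income = (8.0, 6.0)

print(within_group_measures(*higher_income))
print(between_group_measures(higher_income, lower_income))
```

Keeping the within-group and between-group contrasts in separate functions mirrors the structure of the Methods paragraph: each between-group measure is built directly from the corresponding within-group measure.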

Results: Within-group contrasts describing mobility over time, especially for the higher-income decile, were expected to be most resistant to bias during the example study period.

Conclusions: The presented conditions can be used to assess the susceptibility to selection bias of group-level measures comparing mobility. Importantly, they can be used even without knowledge of the degree of bias in each group at each time point. We further highlight links between no-bias principles originating in epidemiology and economics, showing that certain assumptions (e.g., parallel trends) can apply to biases beyond their original application.
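One way to see why such measures can be unbiased even when the bias in each group at each time point is unknown is a small numeric sketch. The bias model below (observed = multiplicative bias × true value + additive bias) and every number in it are assumptions made for illustration; the abstract does not state the paper's conditions in this form, and this is not the authors' analysis.

```python
import math


def observed(true_value, additive_bias=0.0, multiplicative_bias=1.0):
    """Hypothetical bias model: observed = multiplicative * true + additive."""
    return multiplicative_bias * true_value + additive_bias


def bias_cancellation_demo():
    # Made-up true mean mobility as (before, after) for two groups.
    true_a = (10.0, 4.0)
    true_b = (8.0, 6.0)

    # 1. Additive bias that is constant over time cancels in a difference.
    diff_true = true_a[1] - true_a[0]
    diff_obs = observed(true_a[1], additive_bias=1.5) - observed(
        true_a[0], additive_bias=1.5
    )

    # 2. Multiplicative bias that is constant over time cancels in a ratio.
    ratio_true = true_a[1] / true_a[0]
    ratio_obs = observed(true_a[1], multiplicative_bias=0.8) / observed(
        true_a[0], multiplicative_bias=0.8
    )

    # 3. Additive bias that changes over time, but by the same amount in
    #    both groups (a parallel-trends-style condition on the bias),
    #    distorts each within-group difference yet cancels in the
    #    difference in differences.
    a_diff_obs = observed(true_a[1], additive_bias=2.0) - observed(
        true_a[0], additive_bias=1.0
    )
    b_diff_obs = observed(true_b[1], additive_bias=1.5) - observed(
        true_b[0], additive_bias=0.5
    )
    did_true = (true_a[1] - true_a[0]) - (true_b[1] - true_b[0])
    did_obs = a_diff_obs - b_diff_obs

    return {
        "difference_unbiased": math.isclose(diff_obs, diff_true),
        "ratio_unbiased": math.isclose(ratio_obs, ratio_true),
        "within_group_difference_biased": not math.isclose(a_diff_obs, diff_true),
        "did_unbiased": math.isclose(did_obs, did_true),
    }


print(bias_cancellation_demo())
```

In all three cases the size of the bias never has to be known; only its pattern over time or across groups matters, which is the sense in which a parallel-trends-type assumption can be applied to the bias itself.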

Keywords: Big data; COVID-19; Difference-in-differences; Epidemiologic methods; Mobility; Selection bias.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • COVID-19* / epidemiology
  • Humans
  • Information Storage and Retrieval
  • Pandemics*
  • Selection Bias
  • Smartphone
  • United States