Validation of a province-wide commercial food store dataset in a heterogeneous predominantly rural food environment

Public Health Nutr. 2020 Aug;23(11):1889-1895. doi: 10.1017/S1368980019004506. Epub 2020 Apr 16.

Abstract

Objective: Commercially available business (CAB) datasets for food environments have been investigated for error in large urban contexts and some rural areas, but there is a relative dearth of literature that reports error across regions of variable rurality. The objective of the current study was to assess the validity of a CAB dataset using a government dataset at the provincial scale.

Design: A ground-truthed dataset provided by the government of Newfoundland and Labrador (NL) was used to assess a popular commercial dataset. Concordance, sensitivity, positive predictive value (PPV) and geocoding errors were calculated. Measures were stratified by store type and rurality to investigate any association between these variables and database accuracy.
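The measures named in the Design can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the definitions commonly used in food environment validation work (sensitivity = TP/(TP+FN), PPV = TP/(TP+FP), concordance = TP/(TP+FP+FN)) and assumes positional error is a great-circle distance, none of which the abstract states explicitly. The names `MatchCounts` and `haversine_km` and all counts are hypothetical.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class MatchCounts:
    """Store counts after matching the commercial list to the ground-truthed list."""
    true_pos: int   # stores present in both datasets
    false_pos: int  # stores listed only in the commercial dataset
    false_neg: int  # stores listed only in the ground-truthed dataset


def sensitivity(c: MatchCounts) -> float:
    # Share of ground-truthed stores that the commercial dataset captured.
    return c.true_pos / (c.true_pos + c.false_neg)


def ppv(c: MatchCounts) -> float:
    # Share of commercial listings that correspond to a real store.
    return c.true_pos / (c.true_pos + c.false_pos)


def concordance(c: MatchCounts) -> float:
    # Assumed definition: matched stores over all stores seen in either dataset.
    return c.true_pos / (c.true_pos + c.false_pos + c.false_neg)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float,
                 radius_km: float = 6371.0088) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


# Hypothetical counts for a single store type, for illustration only.
counts = MatchCounts(true_pos=50, false_pos=30, false_neg=20)
print(f"sensitivity={sensitivity(counts):.2f}",
      f"PPV={ppv(counts):.2f}",
      f"concordance={concordance(counts):.2f}")
```

Stratifying by store type or rurality would amount to building one `MatchCounts` per stratum and applying the same functions to each; a stratum with no stores would need a guard against division by zero.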

Setting: NL, Canada.

Participants: The current analysis used store-level (ecological) data.

Results: Of 1125 stores, 380 existed in both datasets and were considered true positives. The mean positional error between a ground-truthed point and its test point was 17·72 km. When compared with the provincial dataset of businesses, grocery stores had the greatest agreement (sensitivity = 0·64, PPV = 0·60 and concordance = 0·45) and gas stations had the least (sensitivity = 0·26, PPV = 0·32 and concordance = 0·17). Only 4 % of commercial data points in rural areas matched every criterion examined.
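As a rough consistency check on the reported figures, concordance can be recovered from sensitivity and PPV alone. The sketch below assumes concordance is defined as TP/(TP+FP+FN), which the abstract does not state; under that assumption, 1/concordance = 1/sensitivity + 1/PPV - 1. The inputs are the rounded values reported above, so agreement is only expected to two decimal places. The function name `concordance_from` is hypothetical.

```python
def concordance_from(sens: float, ppv: float) -> float:
    """Concordance implied by sensitivity and PPV, assuming TP / (TP + FP + FN).

    Since 1/sens = (TP + FN) / TP and 1/ppv = (TP + FP) / TP,
    their sum minus 1 equals (TP + FP + FN) / TP, the inverse of concordance.
    """
    return 1.0 / (1.0 / sens + 1.0 / ppv - 1.0)


# (sensitivity, PPV, reported concordance) taken from the Results text.
reported = {
    "grocery stores": (0.64, 0.60, 0.45),
    "gas stations": (0.26, 0.32, 0.17),
}

for store_type, (sens, ppv, stated) in reported.items():
    implied = concordance_from(sens, ppv)
    print(f"{store_type}: implied {implied:.3f}, reported {stated:.2f}")
```

Under the assumed definition, the implied values round to the reported concordance for both store types.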

Conclusions: The commercial dataset exhibits a low level of agreement with the ground-truthed provincial data. Retailers in rural areas and those in the gas station category were particularly affected by misclassification and/or geocoding errors. Taken together, these findings indicate that the commercial dataset is differentially representative of the ground-truthed reality, depending on store type and rurality/urbanity.

Keywords: Canada; Commercial data; Food environment; Food retail; Rurality; Secondary data; Store-type; Validation.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Commerce / statistics & numerical data*
  • Databases, Factual
  • Datasets as Topic / standards*
  • Food Supply / statistics & numerical data*
  • Government
  • Humans
  • Newfoundland and Labrador
  • Predictive Value of Tests
  • Reproducibility of Results
  • Rural Population / statistics & numerical data*
  • Sensitivity and Specificity
  • Social Environment*
  • Urban Population / statistics & numerical data