On the rules of continuity and symmetry for the data quality of street networks

PLoS One. 2018 Jul 12;13(7):e0200334. doi: 10.1371/journal.pone.0200334. eCollection 2018.

Abstract

Knowledge- or rule-based approaches are needed for quality assessment and assurance of professional and crowdsourced geographic data. However, many types of geographic knowledge are statistical in nature, which makes it difficult to derive rules that are meaningful for this purpose. The rules of continuity and symmetry considered in this paper can be thought of as two concrete forms of the first law of geography, and may be used to formulate quality measures at the individual level without referring to ground truth. It is not clear, however, how faithfully the rules hold in practice. Hence, the main objective is to test whether the rules are consistent with street network data from around the world. Specifically, for the rule of continuity we identify natural streets that connect smoothly in a network, and measure the spatial order of information (e.g. names, highway level, speed) along the streets. The measure is based on spatial autocorrelation indicators adapted for one dimension. For the rule of symmetry, we devise an algorithm that recognizes parallel road pairs (e.g. dual carriageways), and examine to what extent attributes in the pairs are identical. The two rules are tested against 28 cities selected from OpenStreetMap data worldwide; two professional data sets are used to provide further insight. We found that the rules are consistent with street networks from a wide range of cities with different characteristics, and also noted cases with varying degrees of agreement. As a by-product, we discuss possible limitations of the autocorrelation indicators used, where caution is needed when interpreting the results. In addition, we present techniques that perform the tests automatically, which can be applied to new data to further verify (or falsify) our findings, or extended as quality assurance tools to detect data items that do not satisfy the rules and to suggest possible corrections according to the rules.
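To make the idea of a one-dimensional autocorrelation indicator concrete, the following is a minimal sketch of Moran's I computed along a single street. It is an illustration only: the abstract does not say which indicators the authors adapted, so the choice of Moran's I, the binary weights between consecutive segments, and the function name `morans_i_1d` are all assumptions made here. The input is taken to be a numeric attribute (for example a speed limit) listed in order along one natural street.

```python
from typing import Optional, Sequence


def morans_i_1d(values: Sequence[float]) -> Optional[float]:
    """Moran's I for an ordered sequence of segment attributes.

    Assumption (not from the paper): each segment is a neighbour only of
    the segments immediately before and after it, with weight 1.
    Returns None when the statistic is undefined (fewer than two
    segments, or no variation in the attribute).
    """
    n = len(values)
    if n < 2:
        return None

    mean = sum(values) / n
    deviations = [v - mean for v in values]

    # Denominator: total squared deviation from the mean.
    total_sq = sum(d * d for d in deviations)
    if total_sq == 0:
        return None

    # Cross-products of consecutive segments. With symmetric binary
    # weights each pair is counted twice and the weight sum is 2(n - 1),
    # so the factors of 2 cancel in the final ratio.
    cross = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))

    return (n / (n - 1)) * (cross / total_sq)


# Hypothetical speed values along two streets (illustrative data only).
ordered = [1, 1, 1, 5, 5, 5]      # similar values cluster together
alternating = [1, 5, 1, 5, 1, 5]  # neighbours always differ

print(round(morans_i_1d(ordered), 3))      # 0.6
print(round(morans_i_1d(alternating), 3))  # -1.0
```

Under these assumptions, a clearly positive value suggests that the attribute changes smoothly along the street, in line with the rule of continuity, while a value near or below zero may point to segments worth checking. A street with a constant attribute has no variance, so the statistic is undefined there; that case is one reason such indicators need to be interpreted with care.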

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cities
  • Data Accuracy*
  • Geography*
  • Humans
  • Models, Theoretical
  • Oxytetracycline

Substances

  • Oxytetracycline

Grants and funding

XZ and TA are supported by the National Key Research and Development Program of China (grant 2017YFB0503500); XZ by the National Natural Science Foundation of China (grants 41671384 and 41301410); TA by the National High Technology Research and Development Program of China (grant 2015AA123901). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Hi-Target Surveying Instrument Co., Ltd., Wuhan Hi-Target Digital Cloud Technology Co., Ltd. and NavInfo Co., Ltd. provided support in the form of salaries for authors JY and ZW, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.