Error Measures for Trajectory Estimations with Geo-tagged Mobility Sample Data

IEEE Trans Intell Transp Syst. 2019 Jul;20(7):2566-2583. doi: 10.1109/TITS.2018.2868182. Epub 2018 Oct 20.

Abstract

Although geo-tagged mobility data (e.g., cell phone data and social media data) can potentially be used to estimate individual space-time travel trajectories, they often have low sample rates: they reveal travelers' whereabouts only at sparse sample times, leaving the remaining activities to be estimated by interpolation. This study proposes a set of time geography-based measures to quantify the accuracy of the trajectory estimation in a robust manner. A series of measures, including activity bandwidth and normalized activity bandwidth, is proposed to quantify the possible absolute and relative error ranges between the estimated trajectories and the ground-truth trajectories, which cannot be observed. These measures can be used to evaluate the suitability of individual trajectories estimated from sparsely sampled geo-tagged mobility data for travel mobility analysis. We suggest cutoff values of these measures to separate useful data with low estimation errors from noisy data with high estimation errors. We conduct theoretical analysis to show that these error measures decrease with sample rates and people's activity ranges. We also propose a lookup table-based interpolation method to reduce computation time. The proposed measures have been applied to 2013 geo-tagged tweet data in New York City and 2014 cell phone data in Shenzhen, China. The results illustrate that the proposed measures can provide estimation error ranges for exceptionally large datasets in much less time than the benchmark method, which does not use lookup tables. These results also reveal managerial insights into the quality of these data for human mobility studies, including their distribution patterns.

Keywords: Geo-tagged data; activity range; cellphone; social media; time geography; trajectory estimation.