Query the trajectory based on the precise track: a Bloom filter-based approach

Geoinformatica. 2021;25(2):397-416. doi: 10.1007/s10707-021-00433-2. Epub 2021 Mar 15.

Abstract

Fast and precise querying in a given set of trajectory points is an important issue of trajectory query. Typically, there are massive trajectory data in the database, yet the query sets only have a few points, which is a challenge for the superior performance of trajectory querying. The current trajectory query methods commonly use the tree-based index structure and the signature-based method to classify, simplify, and filter the trajectory to improve the performance. However, the unstructured essence and the spatiotemporal heterogeneity of the trajectory-sequence lead these methods to a high degree of spatial overlap, frequent I/O, and high memory occupation. Thus, they are not suitable for the time-critical tasks of trajectory big data. In this paper, a query method of trajectory is developed on the Bloom Filter. Based on the gridded space and geocoding, the spatial trajectory sequences (tracks) query is transformed into the query of the text string. The geospace was regularly divided by the geographic grid, and each cell was assigned an independent geocode, converting the high-dimensional irregular space trajectory query into a one-dimensional string query. The point in each cell is regarded as a signature, which forms a mapping to the bit-array of the Bloom Filter. This conversion effectively eliminates the high degree of overlap and instability of query performance. Meanwhile, the independent coding ensures the uniqueness of the whole tracks. In this method, there is no need for additional I/O on the raw trajectory data when the track is queried. Compared to the original data, the memory occupied by this method is negligible. Based on Beijing Taxi and Shenzhen bus trajectory data, an experiment using this method was constructed, and random queries under a variety of conditions boundaries were constructed. The results verified that the performance and stability of our method, compared to R*tree index, have been improved by 2000 to 4000 times, based on one million to tens of millions of trajectory data. And the Bloom Filter-based query method is hardly affected by grid size, original data size, and length of tracks. With such a time advantage, our method is suitable for time-critical spatial computation tasks, such as anti-terrorism, public safety, epidemic prevention, and control, etc.

Keywords: Fast trajectory query; Geography big data; The Bloom filter; Track query.