Efficient processing of top-k frequent spatial keyword queries

Sci Rep. 2022 May 5;12(1):7352. doi: 10.1038/s41598-022-10648-4.

Abstract

The rapid popularization of high-speed mobile communication technology and the continuous development of mobile network devices have given spatial textual big data (STBD) new dimensions due to their ability to record geographical objects from multiple sources and with complex attributes. Data mining from spatial textual datasets has become a meaningful study. As a popular topic for STBD, the top-k spatial keyword query has been developed in various forms to deal with different retrievals requirements. However, previous research focused mainly on indexing locational attributes and retrievals of few target attributes, and these correlations between large numbers of the textual attributes have not been fully studied and demonstrated. To further explore interrelated-knowledge in the textual attributes, this paper defines the top-k frequent spatial keyword query (tfSKQ) and proposes a novel hybrid index structure, named RCL-tree, based on the concept lattice theory. We also develop the tfSKQ algorithms to retrieve the most frequent and nearest spatial objects in STBD. One existing method and two baseline algorithms are implemented, and a series of experiments are carried out using real datasets to evaluate its performance. Results demonstrated the effectiveness and efficiency of the proposed RCL-tree in tfSKQ with the complex spatial multi keyword query conditions.

MeSH terms

  • Algorithms*
  • Data Mining* / methods

Associated data

  • figshare/10.6084/m9.figshare.15052236