SocialSift: Target Query Discovery on Online Social Media With Deep Reinforcement Learning

Changyu Wang; Pinghui Wang; Tao Qin; Chenxu Wang; Suhansanu Kumar; Xiaohong Guan; Jun Liu; Kevin Chen-Chuan Chang

doi:10.1109/TNNLS.2021.3130587

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5654-5668. doi: 10.1109/TNNLS.2021.3130587. Epub 2023 Sep 1.

PMID: 34878981

Abstract

Online social networks (OSNs) host a prohibitively large volume of posts (e.g., tweets on Twitter), so designing effective queries to retrieve the posts of interest is a pressing problem. Two main challenges arise. First, given public application programming interfaces (APIs) that return posts related to keywords drawn from an extremely large vocabulary, how can we infer the keywords relevant to a target interest using as few queries as possible? Second, how can we remain agnostic to the OSN's API? That is, because different social networks typically run on different mechanisms, and may even return results with some randomness, how can we build knowledge of what the API returns with respect to the target interests from scratch? To address these two challenges, we propose SocialSift, a target query discovery framework based on deep reinforcement learning. SocialSift interacts with OSNs' keyword-based APIs and develops its own knowledge for finding the optimal queries with respect to both the target interests and the OSN APIs. Specifically, to address the first challenge, we draw on the human search experience and cast the task as learning to query with context awareness, which reduces the search space by qualifying keywords from returned results and keeping track of the query trial history (the contexts). To address the second challenge, we treat OSNs' APIs as black boxes and probabilistically quantify query-interest pairs guided by rewards, which serve as a well-curated indicator of the target interests. Empirical results on three popular OSNs (Twitter, Reddit, and Amazon) demonstrate that SocialSift significantly outperforms the state-of-the-art baselines by 12% in retrieving target posts.
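The query loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it swaps the deep reinforcement learning agent for a tabular epsilon-greedy value estimate, and the toy corpus `POSTS`, the functions `black_box_api` and `discover_queries`, the relevance labels, and the reward definition (share of returned posts that are newly found target posts) are all hypothetical names and choices made for illustration. It only shows the three ingredients the abstract names: a black-box keyword API with random returns, a context made of query history plus keywords harvested from returned results, and reward-guided scoring of queries.

```python
import random

# Hypothetical toy corpus standing in for an OSN: (post text, is_target).
POSTS = [
    ("flu vaccine clinic opens downtown", True),
    ("flu symptoms fever cough", True),
    ("vaccine side effects fever", True),
    ("football game tonight downtown", False),
    ("new phone battery review", False),
    ("cough syrup recall fever", True),
    ("battery recall phone", False),
]


def black_box_api(keyword, rng, limit=3):
    """Toy keyword API: returns a random subset of the matching posts."""
    hits = [post for post in POSTS if keyword in post[0].split()]
    rng.shuffle(hits)
    return hits[:limit]


def discover_queries(seed, budget, rng, epsilon=0.2, prior=0.5):
    """Epsilon-greedy stand-in for the learned query policy (assumption)."""
    value = {}           # keyword -> running mean reward
    count = {}           # keyword -> number of times queried
    candidates = [seed]  # context: keywords qualified from returned results
    history = []         # context: queries tried so far
    found = set()        # distinct target posts retrieved so far

    for _ in range(budget):
        if rng.random() < epsilon:
            query = rng.choice(candidates)  # explore
        else:
            # Exploit: untried keywords get an optimistic prior score.
            query = max(candidates, key=lambda k: value.get(k, prior))
        history.append(query)

        results = black_box_api(query, rng)
        new_targets = {text for text, ok in results if ok} - found
        # Assumed reward: share of returned posts that are new target posts.
        reward = len(new_targets) / max(len(results), 1)
        found |= new_targets

        # Incremental update of the running mean reward for this query.
        count[query] = count.get(query, 0) + 1
        old = value.get(query, 0.0)
        value[query] = old + (reward - old) / count[query]

        # Expand the candidate set with words from relevant returned posts.
        for text, ok in results:
            if ok:
                for word in text.split():
                    if word not in candidates:
                        candidates.append(word)

    return history, found


if __name__ == "__main__":
    history, found = discover_queries("flu", budget=6, rng=random.Random(0))
    print("queries tried:", history)
    print("target posts found:", len(found))
```

In this sketch the relevance labels come with the toy data; a real system would need some relevance signal for returned posts, which the abstract refers to only as a well-curated reward indicator.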