Research on the prediction of English topic richness in the context of multimedia data

PeerJ Comput Sci. 2024 Apr 16;10:e1967. doi: 10.7717/peerj-cs.1967. eCollection 2024.

Abstract

As the Internet and multimedia technologies evolve, mining multimedia data to predict topic richness has significant practical value for public opinion monitoring and for competition over data discourse power. This study introduces a Transformer-based algorithm for predicting English topic richness, applied specifically to the Twitter platform. First, relevant data are organized and extracted following an analysis of Twitter's characteristics. Next, a feature fusion approach is used to mine, extract, and construct features from Twitter blogs and users, covering blog features, topic features, and user features, which are combined into multimodal features. Finally, the combined features are used to train the Transformer model. In experiments on the Twitter topic richness dataset, the algorithm achieves an accuracy of 82.3%, supporting the efficacy and superior performance of the proposed approach.
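The fusion-then-attention idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the feature dimensions, the fusion operator, or the model configuration, so everything below is an assumption made for illustration. The sketch treats the blog, topic, and user feature vectors as three tokens, applies one scaled dot-product self-attention step with identity projections (the core operation of a Transformer layer, without learned weights), mean-pools the result into one multimodal vector, and passes it through a hypothetical logistic head. The names `fuse` and `predict_richness`, the example values, and the untrained weights are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Assumption: identity projections stand in for the learned
    query/key/value matrices of a real Transformer layer.
    """
    d = len(tokens[0])
    attended = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Weighted average of all tokens, one output dimension at a time.
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)]
        )
    return attended


def fuse(blog, topic, user):
    """Hypothetical fusion: attend across the three modality vectors, then mean-pool."""
    attended = self_attention([blog, topic, user])
    d = len(blog)
    return [sum(tok[i] for tok in attended) / len(attended) for i in range(d)]


def predict_richness(fused, weights, bias=0.0):
    """Hypothetical logistic head mapping the fused vector to a score in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Made-up feature vectors of equal length (assumption: features were already
# extracted and projected to a shared dimension).
blog_features = [0.2, 0.8, 0.1, 0.5]
topic_features = [0.6, 0.1, 0.9, 0.3]
user_features = [0.4, 0.4, 0.2, 0.7]

fused = fuse(blog_features, topic_features, user_features)
score = predict_richness(fused, weights=[0.1, 0.1, 0.1, 0.1])  # untrained weights
print(fused, score)
```

In a trained model the identity projections and the fixed head weights would be replaced by learned parameters, and the score would be compared against labels from the topic richness dataset; the sketch only shows how three feature groups might interact through attention before a prediction is made.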

Keywords: Multi-modal features extraction; Multimedia data; Topic richness; Transformer.

Grants and funding

This work was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R54), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.