An Entropy-Based Method with a New Benchmark Dataset for Chinese Textual Affective Structure Analysis

Entropy (Basel). 2023 May 13;25(5):794. doi: 10.3390/e25050794.

Abstract

Affective understanding of language is an important research focus in artificial intelligence. Large-scale annotated datasets of Chinese textual affective structure (CTAS) are the foundation for subsequent higher-level analysis of documents. However, very few CTAS datasets have been published. This paper introduces a new benchmark dataset for the CTAS task to promote development in this research direction. Specifically, our benchmark has the following advantages: (a) it is built from Weibo, the most popular Chinese social media platform used by the public to express their opinions; and (b) it includes the most comprehensive affective structure labels at present. In addition, we propose a maximum entropy Markov model that incorporates neural network features and experimentally demonstrate that it outperforms the two baseline models.

Keywords: Chinese benchmark datasets; affective computing; affective structure; corpus annotation.