A Korean emotion-factor dataset for extracting emotion and factors in Korean conversations

Sci Rep. 2023 Oct 29;13(1):18547. doi: 10.1038/s41598-023-45386-8.

Abstract

Humans express their emotions in various ways, such as through facial expressions and voice. In text, emotions are either directly expressed or indirectly implied in an utterance. Research in conversational artificial intelligence is developing technology to identify the emotions in human speech and to generate utterances. Generating emotion-based utterances requires recognizing the factors behind previously generated emotions, yet most existing datasets provide only emotion classification for text and utterances. In addition, Korean datasets offer a limited range of emotion classes and are biased mainly toward negative emotions. In this paper, we propose KEmoFact, a Korean emotion-factor dataset for extracting emotions and their factors in Korean conversations. We also define two tasks for the KEmoFact dataset, Emotion Factor Extraction (EFE) and Emotion-Factor Pair Extraction (EFPE), and propose baseline models for both tasks. By proposing the KEmoFact dataset and baseline models for the two tasks, we contribute to the study of conversational artificial intelligence, especially in Korean, one of the low-resource languages.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Communication
  • Emotions*
  • Humans
  • Republic of Korea
  • Speech