A Hybrid Multimodal Emotion Recognition Framework for UX Evaluation Using Generalized Mixture Functions

Sensors (Basel). 2023 Apr 28;23(9):4373. doi: 10.3390/s23094373.

Abstract

Multimodal emotion recognition has gained much traction in the field of affective computing, human-computer interaction (HCI), artificial intelligence (AI), and user experience (UX). There is growing demand to automate analysis of user emotion towards HCI, AI, and UX evaluation applications for providing affective services. Emotions are increasingly being used, obtained through the videos, audio, text or physiological signals. This has led to process emotions from multiple modalities, usually combined through ensemble-based systems with static weights. Due to numerous limitations like missing modality data, inter-class variations, and intra-class similarities, an effective weighting scheme is thus required to improve the aforementioned discrimination between modalities. This article takes into account the importance of difference between multiple modalities and assigns dynamic weights to them by adapting a more efficient combination process with the application of generalized mixture (GM) functions. Therefore, we present a hybrid multimodal emotion recognition (H-MMER) framework using multi-view learning approach for unimodal emotion recognition and introducing multimodal feature fusion level, and decision level fusion using GM functions. In an experimental study, we evaluated the ability of our proposed framework to model a set of four different emotional states (Happiness, Neutral, Sadness, and Anger) and found that most of them can be modeled well with significantly high accuracy using GM functions. The experiment shows that the proposed framework can model emotional states with an average accuracy of 98.19% and indicates significant gain in terms of performance in contrast to traditional approaches. The overall evaluation results indicate that we can identify emotional states with high accuracy and increase the robustness of an emotion classification system required for UX measurement.

Keywords: audio-based emotion recognition; decision fusioning; emotion recognition; feature fusioning; generalized mixture function; user experience.

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Electroencephalography / methods
  • Emotions / physiology
  • Humans
  • Learning
  • Recognition, Psychology

Grants and funding

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government(MSIT) (IITP-2022-0-00078, Explainable Logical Reasoning for Medical Knowledge Generation) and (IITP-2017-0-00655, Lean UX core technology and platform for any digital artifacts UX evaluation) and the Grand Information Technology Research Center support program (IITP-2022-2020-0-01489).