Understanding Naturalistic Facial Expressions with Deep Learning and Multimodal Large Language Models

Sensors (Basel). 2023 Dec 26;24(1):126. doi: 10.3390/s24010126.

Abstract

This paper provides a comprehensive overview of affective computing systems for facial expression recognition (FER) research in naturalistic contexts. The first section presents an updated account of user-friendly FER toolboxes incorporating state-of-the-art deep learning models and elaborates on their neural architectures, datasets, and performance across domains. These sophisticated FER toolboxes can robustly address a variety of challenges encountered in the wild, such as variations in illumination and head pose, which may otherwise impair recognition accuracy. The second section of this paper discusses multimodal large language models (MLLMs) and their potential applications in affective science. MLLMs exhibit human-level capabilities for FER and enable the quantification of various contextual variables to provide context-aware emotion inferences. These advancements have the potential to revolutionize current methodological approaches for studying contextual influences on emotions, leading to the development of contextualized emotion models.
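The context-aware emotion inference described above can be illustrated with a minimal sketch: fusing face-only expression probabilities (as produced by a FER model) with a contextual prior via elementwise product and renormalization. This is a hypothetical naive-Bayes-style fusion for illustration only, not the method of any specific toolbox or MLLM discussed in the review; the labels and probability values are invented.

```python
def fuse_context(face_probs, context_prior):
    """Combine face-only expression probabilities with a contextual prior
    by elementwise product and renormalization (naive Bayes-style fusion).

    Both arguments are dicts mapping emotion label -> probability.
    Hypothetical illustration; not a published algorithm from the review.
    """
    fused = {e: p * context_prior.get(e, 0.0) for e, p in face_probs.items()}
    total = sum(fused.values())
    if total == 0:
        raise ValueError("face probabilities and prior share no support")
    return {e: p / total for e, p in fused.items()}

# Invented example values: a FER model reads the face as mostly happy,
# but the scene context (say, a startling situation) favors fear.
face_probs = {"happy": 0.55, "surprise": 0.35, "fear": 0.10}
context_prior = {"happy": 0.2, "surprise": 0.3, "fear": 0.5}

fused = fuse_context(face_probs, context_prior)
# Context raises the fused probability of fear above the face-only estimate.
```

Under these made-up numbers, the contextual prior shifts probability mass toward the context-congruent emotion, which is the kind of context-sensitive inference the abstract argues MLLMs could make quantifiable at scale.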

Keywords: automatic facial expression recognition; deep learning; multimodal large language model; naturalistic context.

Publication types

  • Review

MeSH terms

  • Awareness
  • Deep Learning*
  • Emotions
  • Facial Expression
  • Humans
  • Language

Grants and funding

This research received no external funding.