Multimodal processing in face-to-face interactions: A bridging link between psycholinguistics and sensory neuroscience

Stefania Benetti; Ambra Ferrari; Francesco Pavani

doi:10.3389/fnhum.2023.1108354

Front Hum Neurosci. 2023 Feb 2:17:1108354. doi: 10.3389/fnhum.2023.1108354. eCollection 2023.

Authors

Stefania Benetti¹,², Ambra Ferrari³, Francesco Pavani¹,²

Affiliations

¹ Centre for Mind/Brain Sciences, University of Trento, Trento, Italy.
² Interuniversity Research Centre "Cognition, Language, and Deafness", CIRCLeS, Catania, Italy.
³ Max Planck Institute for Psycholinguistics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands.

Abstract

In face-to-face communication, humans are faced with multiple layers of discontinuous multimodal signals, such as head, face, and hand gestures, as well as speech and non-speech sounds, which need to be interpreted as coherent and unified communicative actions. This implies a fundamental computational challenge: optimally binding only signals belonging to the same communicative action while segregating signals that are not connected by the communicative content. How do we achieve such an extraordinary feat reliably and efficiently? To address this question, we need to move the study of human communication further beyond speech-centred perspectives and promote a multimodal approach combined with interdisciplinary cooperation. Accordingly, we seek to reconcile two explanatory frameworks recently proposed in psycholinguistics and sensory neuroscience into a neurocognitive model of multimodal face-to-face communication. First, we introduce a psycholinguistic framework that characterises face-to-face communication at three parallel processing levels: multiplex signals, multimodal gestalts, and multilevel predictions. Second, we consider the recent proposal of a lateral neural visual pathway specifically dedicated to the dynamic aspects of social perception and reconceive it from a multimodal perspective ("lateral processing pathway"). Third, we reconcile the two frameworks into a neurocognitive model that proposes how multiplex signals, multimodal gestalts, and multilevel predictions may be implemented along the lateral processing pathway. Finally, we advocate a multimodal and multidisciplinary research approach, combining state-of-the-art imaging techniques, computational modelling, and artificial intelligence for future empirical testing of our model.

Keywords: face-to-face interactions; lateral cortical processing pathway; multimodal communication; psycholinguistics; sensory neuroscience; social actions.

Grants and funding

SB was supported by a “Starting Grant DM 737/21” from the University of Trento (R06). SB and FP were supported by a “Progetto di Rilevante Interesse Nazionale (PRIN)” from the Italian Ministry for Education, University and Research (MIUR-PRIN 2017 n.20177894ZH).