Finding the meaning in meaning maps: Quantifying the roles of semantic and non-semantic scene information in guiding visual attention

Cognition. 2024 Jun;247:105788. doi: 10.1016/j.cognition.2024.105788. Epub 2024 Apr 5.

Abstract

In real-world vision, people prioritise the most informative scene regions via eye movements. According to the cognitive guidance theory of visual attention, viewers allocate visual attention to those parts of a scene that are expected to be the most informative. The expected information of a scene region is coded in the semantic distribution of that scene. Meaning maps have been proposed to capture the spatial distribution of local scene semantics and thereby to test cognitive guidance theories of attention. Notwithstanding the success of meaning maps in predicting visual attention, the reason for that success has been contested, leading to at least two possible explanations. On the one hand, meaning maps might measure scene semantics. On the other hand, they might measure scene features that overlap with, but are distinct from, scene semantics. This study aims to disentangle these two sources of information by considering conceptual information and non-semantic scene entropy simultaneously. We found that meaning maps capture both semantic and non-semantic information, but scene entropy accounted for more unique variance in the predictive success of meaning maps than conceptual information did. Additionally, some of the variance explained by meaning maps was not accounted for by either source of information. Thus, although meaning maps may index some aspect of semantic information, their success seems to be better explained by non-semantic information. We conclude that meaning maps may not yet be a good tool for testing cognitive guidance theories of attention in general, since they capture non-semantic aspects of local semantic density and only a small portion of conceptual information. Rather, we suggest that researchers should define more precisely which aspect of cognitive guidance theories they wish to test and then use the tool that best captures the relevant semantic information. As it stands, the semantic information contained in meaning maps seems too ambiguous to support strong conclusions about how and when semantic information guides visual attention.
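
For readers unfamiliar with scene entropy as a non-semantic image measure, the following is a minimal illustrative sketch, not taken from the paper, of how Shannon entropy of pixel intensities might be computed over local image patches. The function name, patch size, and histogram binning are assumptions chosen purely for illustration; the study's actual entropy computation may differ.

```python
import numpy as np

def patch_entropy(gray_image, patch_size=16, n_bins=32):
    """Shannon entropy of pixel intensities in non-overlapping patches.

    gray_image: 2D array of intensities scaled to [0, 1].
    Returns a 2D array with one entropy value (in bits) per patch.
    """
    h, w = gray_image.shape
    rows, cols = h // patch_size, w // patch_size
    entropy_map = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            patch = gray_image[r * patch_size:(r + 1) * patch_size,
                               c * patch_size:(c + 1) * patch_size]
            counts, _ = np.histogram(patch, bins=n_bins, range=(0.0, 1.0))
            p = counts / counts.sum()
            p = p[p > 0]  # drop empty bins before taking the log
            entropy_map[r, c] = -np.sum(p * np.log2(p))
    return entropy_map

# Example: a uniform-noise "scene" yields near-maximal entropy per patch,
# whereas a flat region would yield entropy close to zero.
rng = np.random.default_rng(0)
scene = rng.random((256, 256))
print(patch_entropy(scene).mean())
```

Such an entropy map is purely feature-based: it reflects local intensity variability without reference to what the depicted objects mean, which is what makes it a useful non-semantic comparison against meaning maps.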

Keywords: Conceptual similarity; Entropy; Eye movements; Meaning maps; Scene semantics; Visual attention.