A crowd of emotional voices influences the perception of emotional faces: Using adaptation, stimulus salience, and attention to probe audio-visual interactions for emotional stimuli

Atten Percept Psychophys. 2020 Nov;82(8):3973-3992. doi: 10.3758/s13414-020-02104-0.

Abstract

Correctly assessing the emotional state of others is a crucial part of social interaction. While facial expressions provide much information, faces are often not viewed in isolation but occur with concurrent sounds, usually voices, which also provide information about the emotion being portrayed. Many studies have examined the crossmodal processing of faces and sounds, but findings have been mixed, with different paradigms yielding different results. Using a psychophysical adaptation paradigm, we carried out a series of four experiments to determine whether there is a perceptual advantage when faces and voices match in emotion (congruent) versus when they do not match (incongruent). Across the experiments, we paired a crowd of voices with a single face, with a crowd of faces, and with a single face of reduced salience, and we tested this last pairing with and without attention directed to the emotion in the face. While we observed aftereffects in the hypothesized direction (adaptation to faces conveying positive emotion yielded negative, contrastive, perceptual aftereffects), we found a congruent advantage (stronger adaptation effects) only when faces were attended and of reduced salience, in line with the principle of inverse effectiveness.

Keywords: Crossmodal; Multisensory emotional processing; Principle of inverse effectiveness; Visual salience.

MeSH terms

  • Attention
  • Emotions*
  • Facial Expression
  • Humans
  • Visual Perception
  • Voice*