Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception

Curr Biol. 2022 Sep 26;32(18):3971-3986.e4. doi: 10.1016/j.cub.2022.07.047. Epub 2022 Aug 15.

Abstract

How the human auditory cortex represents spatially separated simultaneous talkers, and how talkers' locations and voices modulate the neural representations of attended and unattended speech, remain unclear. Here, we measured the neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused preferential encoding of the contralateral speech in Heschl's gyrus (HG), planum temporale (PT), and superior temporal gyrus (STG). Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, the talker's location changed the mean response level, whereas the talker's spectrotemporal features altered the variation of the response around its baseline. These components were differentially modulated by the attended talker's voice or location, which improved the population decoding of attended speech features. Attentional modulation due to the talker's voice appeared only in auditory areas with longer latencies, whereas attentional modulation due to location was present throughout. Our results show that spatial multi-talker speech perception relies on a separable pre-attentive neural representation, which can be further tuned by top-down attention to the location and voice of the talker.

Keywords: attention; auditory cortex; sound localization in humans; spatial multi-talker speech perception.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Auditory Cortex* / physiology
  • Humans
  • Speech
  • Speech Perception* / physiology
  • Temporal Lobe
  • Voice*