Weighting of cues to categorization of song versus speech in tone-language and non-tone-language speakers

Magdalena Kachlicka; Aniruddh D Patel; Fang Liu; Adam Tierney

doi:10.1016/j.cognition.2024.105757

Cognition. 2024 May:246:105757. doi: 10.1016/j.cognition.2024.105757. Epub 2024 Mar 4.

Authors

Magdalena Kachlicka¹, Aniruddh D Patel², Fang Liu³, Adam Tierney⁴

Affiliations

¹ Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, United Kingdom.
² Department of Psychology, Tufts University, 419 Boston Ave, Medford, USA; Program in Brain, Mind, and Consciousness, Canadian Institute for Advanced Research, 661 University Avenue, Toronto, Canada.
³ School of Psychology and Clinical Language Sciences, University of Reading, Whiteknights, Reading, United Kingdom.
⁴ Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, United Kingdom. Electronic address: a.tierney@bbk.ac.uk.

PMID: 38442588

Abstract

One of the most important auditory categorization tasks a listener faces is determining a sound's domain, a process which is a prerequisite for successful within-domain categorization tasks such as recognizing different speech sounds or musical tones. Speech and song are universal in human cultures: how do listeners categorize a sequence of words as belonging to one or the other of these domains? There is growing interest in the acoustic cues that distinguish speech and song, but it remains unclear whether there are cross-cultural differences in the evidence upon which listeners rely when making this fundamental perceptual categorization. Here we use the speech-to-song illusion, in which some spoken phrases perceptually transform into song when repeated, to investigate cues to this domain-level categorization in native speakers of tone languages (Mandarin and Cantonese speakers residing in the United Kingdom and China) and in native speakers of a non-tone language (English). We find that native tone-language and non-tone-language listeners largely agree on which spoken phrases sound like song after repetition, and we also find that the strength of this transformation is not significantly different across language backgrounds or countries of residence. Furthermore, we find a striking similarity in the cues upon which listeners rely when perceiving word sequences as singing versus speech, including small pitch intervals, flat within-syllable pitch contours, and steady beats. These findings support the view that there are certain widespread cross-cultural similarities in the mechanisms by which listeners judge if a word sequence is spoken or sung.

Keywords: Categorization; Cue-weighting; Illusion; Song; Speech.

MeSH terms

Cues
Humans
Language
Phonetics
Pitch Perception
Speech Perception*
Speech*