Talker familiarity and the accommodation of talker variability

James S Magnuson; Howard C Nusbaum; Reiko Akahane-Yamada; David Saltzman

doi:10.3758/s13414-020-02203-y

Atten Percept Psychophys. 2021 May;83(4):1842-1860. doi: 10.3758/s13414-020-02203-y. Epub 2021 Jan 4.

Authors

James S Magnuson¹, Howard C Nusbaum², Reiko Akahane-Yamada³, David Saltzman⁴

Affiliations

¹ Department of Psychological Sciences, and CT Institute for the Brain and Cognitive Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT, 06269-1020, USA. james.magnuson@uconn.edu.
² Department of Psychology, University of Chicago, Chicago, IL, USA.
³ Advanced Telecommunications Research Institute International, Kyoto, Japan.
⁴ Department of Psychological Sciences, and CT Institute for the Brain and Cognitive Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT, 06269-1020, USA.

PMID: 33398658
DOI: 10.3758/s13414-020-02203-y

Abstract

A fundamental problem in speech perception is how (or whether) listeners accommodate variability in the way talkers produce speech. One view of the way listeners cope with this variability is that talker differences are normalized: a mapping between talker-specific characteristics and phonetic categories is computed such that speech is recognized in the context of the talker's vocal characteristics. Consistent with this view, listeners process speech more slowly when the talker changes randomly than when the talker remains constant. An alternative view is that speech perception is based on talker-specific auditory exemplars in memory clustered around linguistic categories that allow talker-independent perception. Consistent with this view, listeners become more efficient at talker-specific phonetic processing after voice identification training. We asked whether phonetic efficiency would increase with talker familiarity by testing listeners with extremely familiar talkers (family members), newly familiar talkers (based on laboratory training), and unfamiliar talkers. We also asked whether familiarity would reduce the need for normalization. As predicted, phonetic efficiency (word recognition in noise) increased with familiarity (unfamiliar < trained-on < family). However, we observed a constant processing cost for talker changes even for pairs of family members. We discuss how normalization and exemplar theories might account for these results, and constraints the results impose on theoretical accounts of phonetic constancy.

Keywords: Psycholinguistics; Speech perception.

MeSH terms

Humans
Phonetics
Recognition, Psychology
Speech
Speech Perception*
Voice*

Grants and funding

1754284/National Science Foundation