Learning phonetic categories by tracking movements

Bruno Gauthier; Rushen Shi; Yi Xu

doi:10.1016/j.cognition.2006.03.002

Cognition. 2007 Apr;103(1):80-106. doi: 10.1016/j.cognition.2006.03.002. Epub 2006 May 2.

Authors

Bruno Gauthier¹, Rushen Shi, Yi Xu

Affiliation

¹ Département de psychologie, Université du Québec à Montréal, C.P. 8888, Succursale Centre-Ville, Montréal, Que., Canada H3C 3P8. gauthier.bruno@courrier.uqam.ca

PMID: 16650399
DOI: 10.1016/j.cognition.2006.03.002

Abstract

We explore in this study how infants may derive phonetic categories from adult input that is highly variable. Neural networks in the form of self-organizing maps (SOMs) were used to simulate unsupervised learning of Mandarin tones. In Simulation 1, we trained the SOMs with syllable-sized continuous F0 contours, produced by multiple speakers in connected speech, and with the corresponding velocity profiles (D1). No attempt was made to reduce the large amount of variability in the input or to add to the input any abstract features such as height and slope of the F0 contours. In the testing phase, a reasonably high categorization rate was achieved with F0 profiles, but D1 profiles yielded almost perfect categorization of the four tones. Close inspection of the learned prototypical D1 profile clusters revealed that they had effectively eliminated surface variability and directly reflected articulatory movements toward the underlying targets of the four tones, as previously proposed. Additional simulations indicated that a further learning step was possible through which D1 prototypes with one-to-one correspondence to the tones were derived from the prototype clusters learned in Simulation 1. Implications of these findings for theories of language acquisition, speech perception and speech production are discussed.
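The two ingredients named in the abstract, a velocity profile (D1) derived from an F0 contour and a self-organizing map trained without labels, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names, the 4 x 4 map size, the learning-rate and neighbourhood schedules, and the synthetic contours (four idealized shapes with random per-speaker height offsets) are all hypothetical. D1 is approximated here by finite differences.

```python
import numpy as np

def velocity_profile(f0, dt=1.0):
    """Approximate the first derivative (D1) of an F0 contour by finite differences."""
    return np.gradient(np.asarray(f0, dtype=float), dt)

def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; all hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialise node weights from randomly chosen training vectors.
    weights = data[rng.integers(0, len(data), rows * cols)].copy()
    # Grid coordinates of each node, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5     # shrinking neighbourhood radius
            # Best-matching unit: node whose weights are closest to the input.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Pull every node toward the input, weighted by the neighbourhood.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_matching_unit(weights, x):
    """Index of the map node closest to input vector x."""
    x = np.asarray(x, dtype=float)
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))

# Synthetic toy input (NOT the study's data): four idealized contour shapes,
# each repeated with a random height offset standing in for speaker variability.
t = np.linspace(0.0, 1.0, 30)
shapes = [np.zeros_like(t), t, -t, (2.0 * t - 1.0) ** 2]
rng = np.random.default_rng(1)
f0 = np.array([100.0 + 20.0 * rng.normal() + 30.0 * s + rng.normal(0.0, 0.1, t.size)
               for s in shapes for _ in range(20)])
d1 = np.array([velocity_profile(c, dt=t[1] - t[0]) for c in f0])

som = train_som(d1)                      # unsupervised: no tone labels are used
node = best_matching_unit(som, d1[0])    # map node that responds to the first contour
```

In this toy setting, adding a constant to an F0 contour leaves its D1 profile unchanged, which is one simple way a velocity representation can discard height differences between speakers. Whether that accounts for the categorization advantage reported in the abstract is not something this sketch can establish.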

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Humans
Infant
Language
Models, Psychological
Movement*
Phonetics*
Speech Perception
Verbal Learning*

Grants and funding

DC006243/DC/NIDCD NIH HHS/United States