Learning phonemes with a proto-lexicon

Andrew Martin; Sharon Peperkamp; Emmanuel Dupoux

doi:10.1111/j.1551-6709.2012.01267.x

Cogn Sci. 2013 Jan-Feb;37(1):103-24. doi: 10.1111/j.1551-6709.2012.01267.x. Epub 2012 Sep 17.

Authors

Andrew Martin¹, Sharon Peperkamp, Emmanuel Dupoux

Affiliation

¹ Laboratoire de Sciences Cognitives et Psycholinguistique (EHESS-ENS-CNRS); Laboratory for Language Development, RIKEN Brain Science Institute, Saitama, Japan. amartin@brain.riken.jp

PMID: 22985465

Abstract

Before the end of the first year of life, infants begin to lose the ability to perceive distinctions between sounds that are not phonemic in their native language. It is typically assumed that this developmental change reflects the construction of language-specific phoneme categories, but how these categories are learned largely remains a mystery. Peperkamp, Le Calvez, Nadal, and Dupoux (2006) present an algorithm that can discover phonemes using the distributions of allophones as well as the phonetic properties of the allophones and their contexts. We show that a third type of information source, the occurrence of pairs of minimally differing word forms in speech heard by the infant, is also useful for learning phonemic categories and is in fact more reliable than purely distributional information in data containing a large number of allophones. In our model, learners build an approximation of the lexicon consisting of the high-frequency n-grams present in their speech input, allowing them to take advantage of top-down lexical information without needing to learn words. This may explain how infants have already begun to exhibit sensitivity to phonemic categories before they have a large receptive lexicon.
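
The proto-lexicon idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the n-gram length range, and the frequency threshold are all assumptions chosen for illustration. The sketch treats frequent n-grams of segments as stand-ins for word forms and collects the segment pairs that distinguish two such forms differing in exactly one position. How those pairs are then weighed when deciding phonemic status is not specified in the abstract, so the sketch stops at collecting them.

```python
from collections import Counter
from itertools import combinations


def build_proto_lexicon(utterances, n_min=3, n_max=5, min_count=2):
    """Return the set of segment n-grams seen at least min_count times.

    utterances: iterable of sequences of segment labels (hypothetical input format).
    n_min, n_max, min_count: illustrative assumptions, not values from the paper.
    """
    counts = Counter()
    for utt in utterances:
        for n in range(n_min, n_max + 1):
            for i in range(len(utt) - n + 1):
                counts[tuple(utt[i:i + n])] += 1
    return {gram for gram, c in counts.items() if c >= min_count}


def minimal_pair_segments(proto_lexicon):
    """Return the segment pairs that separate two minimally differing forms.

    Two proto-lexicon entries count as minimally differing here when they
    have the same length and differ in exactly one position.
    """
    by_length = {}
    for gram in proto_lexicon:
        by_length.setdefault(len(gram), []).append(gram)

    pairs = set()
    for grams in by_length.values():
        for a, b in combinations(grams, 2):
            diffs = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diffs) == 1:
                pairs.add(frozenset(diffs[0]))
    return pairs


def forms_minimal_pair(seg1, seg2, pairs):
    """True if seg1 and seg2 distinguish some minimally differing pair of forms."""
    return frozenset((seg1, seg2)) in pairs
```

Because the proto-lexicon is built from raw n-gram counts, no word segmentation or word meanings are needed, which matches the abstract's point that lexical information can be exploited before words are learned. The pairwise comparison is quadratic in the number of same-length entries, which is acceptable for a toy corpus but would need indexing for realistic input.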

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Artificial Intelligence
Comprehension
Humans
Infant
Language Development*
Models, Statistical
Phonetics*
Psycholinguistics
Semantics
Speech Perception
Verbal Learning
Vocabulary