How infants' utterances grow: A probabilistic account of early language development

Cognition. 2023 Jan:230:105275. doi: 10.1016/j.cognition.2022.105275. Epub 2022 Oct 7.

Abstract

Why are children's first utterances short and ungrammatical, with some obvious constructions missing? What determines the lengthening of children's early utterances over time? The literature is replete with references to a one-word, a two-word, and a later multiword stage in language development, but it offers little empirical evidence for these stages and little account of how and why utterances grow. To address these questions, we analyze speech samples from 25 children between the ages of 14 and 43 months; we construct distributions of their utterances of lengths one to five by age. Our novel findings are that multiword utterances of different lengths appear early in acquisition and increase together until they reach relatively stable proportions similar to those found in parents' input. To explain such patterns, we develop a probabilistic computational model, VIRTUAL, that posits an interaction between (a) varying, increasing resources from various developmental domains and (b) target utterance lengths mirroring the input. VIRTUAL successfully accounts for most of the empirical patterns, suggesting a probabilistic and dynamic process that is nonetheless compatible with apparent distinct milestones in development. We provide a new, systematic way of showing how developmental cascade theories could work in language development. Our findings and model also suggest insights into syntactic, semantic, and cognitive development.
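
The empirical step described in the abstract, building distributions of utterance lengths one to five by age, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `length_distributions`, the whitespace-based word count, the one-month age bins, and the toy utterances are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def length_distributions(utterances, max_len=5):
    """Return {age_months: {length: proportion}} for lengths 1..max_len.

    `utterances` is an iterable of (age_in_months, utterance_text) pairs.
    Length is counted as whitespace-separated words (an assumption; the
    study may define utterance length differently).
    """
    counts = defaultdict(Counter)
    for age_months, text in utterances:
        n_words = len(text.split())
        # Keep only utterances of length 1..max_len, as in the abstract.
        if 1 <= n_words <= max_len:
            counts[age_months][n_words] += 1

    result = {}
    for age, counter in counts.items():
        total = sum(counter.values())
        # Counter returns 0 for lengths that never occurred at this age.
        result[age] = {k: counter[k] / total for k in range(1, max_len + 1)}
    return result


# Hypothetical toy sample, not data from the study.
sample = [
    (18, "ball"),
    (18, "more juice"),
    (18, "doggie"),
    (18, "no"),
    (30, "I want that"),
    (30, "where daddy go"),
    (30, "ball"),
    (30, "put it on the table"),
]
dists = length_distributions(sample)
```

Comparing the resulting per-age proportions with the same proportions computed over parents' speech would be one way to check the reported pattern that children's utterance-length distributions converge toward those of the input.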

Keywords: Computational models; Corpus linguistics; Language acquisition; Utterance length.

MeSH terms

  • Child
  • Child Language
  • Child, Preschool
  • Creativity
  • Humans
  • Infant
  • Language Development*
  • Linguistics
  • Semantics
  • Speech*