How many words make a sample? Determining the minimum number of word tokens needed in connected speech samples for child speech assessment

Clin Linguist Phon. 2021 Aug 3;35(8):761-778. doi: 10.1080/02699206.2020.1827458. Epub 2020 Oct 6.

Abstract

Connected speech (CS) is an important component of child speech assessment in both clinical practice and research. There is debate in the literature about how large a CS sample must be to yield reliable measures of speech output. The aim of this study was to identify the minimum number of word tokens required to obtain a reliable measure of CS across a range of measures. Participants were 776 5-year-olds from a longitudinal community population cohort study (Avon Longitudinal Study of Parents and Children, ALSPAC). Children's narratives from a story retell task were audio-recorded and phonetically transcribed. The transcribed speech samples were analysed using an automated transcription and analysis system. Measures of speech performance extracted included: a range of profiles of percentage consonant correct; frequency of substitutions, omissions, distortions and additions (SODA); percentage of syllable and stress pattern matches; and a measure of whole word complexity (Phonological Mean Length of Utterance, pMLU). Statistical analyses compared these measures at incrementally increasing CS sample sizes using averages and weighted moving averages, and investigated how measures performed between CS samples grouped by word token count: at least 50, 75 and 100 tokens, and restricted ranges of 50-74, 75-99 and 100-125 tokens. Samples of 75 word tokens and above showed minimal differences in most measures of speech output, suggesting that the minimum requirement for CS samples is a word count of 75. The exceptions are pMLU and measures of substitutions and distortions, for which a word count of 100 is recommended.
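
To illustrate the idea of comparing a measure at different sample sizes, the sketch below computes percentage consonant correct over the first 50, 75 and 100 word tokens of a single sample. It is only a minimal sketch, not the study's analysis system: the `WordToken` representation (target consonant count, correctly produced consonant count) and the function names `pcc` and `pcc_at_cutoffs` are assumptions made for illustration, and the scoring rules used in the study are not described in this abstract.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation of one transcribed word token:
# (consonants in the target form, consonants produced correctly).
WordToken = Tuple[int, int]


def pcc(tokens: Sequence[WordToken]) -> float:
    """Percentage consonant correct over a sequence of word tokens."""
    total = sum(target for target, _ in tokens)
    correct = sum(matched for _, matched in tokens)
    if total == 0:
        raise ValueError("sample contains no target consonants")
    return 100.0 * correct / total


def pcc_at_cutoffs(
    tokens: List[WordToken],
    cutoffs: Sequence[int] = (50, 75, 100),
) -> Dict[int, float]:
    """PCC computed on the first n tokens, for each cutoff the sample reaches.

    Cutoffs larger than the sample length are skipped rather than
    computed on a shorter sample.
    """
    return {n: pcc(tokens[:n]) for n in cutoffs if len(tokens) >= n}


# Made-up sample of 80 word tokens, for demonstration only.
sample: List[WordToken] = [(2, 2), (3, 2)] * 40
result = pcc_at_cutoffs(sample)
```

For a sample of 80 tokens, `result` holds values for the 50- and 75-token cutoffs only. Comparing such values across cutoffs, over many children, is the kind of check that indicates where a measure stops changing as the sample grows.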

Keywords: speech; ALSPAC; connected speech; sample size; speech sound disorder; transcription.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child, Preschool
  • Cohort Studies
  • Humans
  • Longitudinal Studies
  • Phonetics*
  • Speech Production Measurement
  • Speech*