A computational and empirical investigation of graphemes in reading

Cogn Sci. 2013 Jul;37(5):800-28. doi: 10.1111/cogs.12030. Epub 2013 Mar 14.

Abstract

It is often assumed that graphemes are a crucial level of orthographic representation above letters. Current connectionist models of reading, however, do not address how the mapping from letters to graphemes is learned. One major challenge for computational modeling is therefore developing a model that learns this mapping and can assign graphemes to linguistically meaningful categories such as the onset, vowel, and coda of a syllable. Here, we present a model that learns to do this in English for strings of any letter length and any number of syllables. The model is evaluated on error rates and further validated on the results of a behavioral experiment designed to examine ambiguities in the processing of graphemes. The results show that the model (a) chooses graphemes from letter strings with a high level of accuracy, even when trained on only a small portion of the English lexicon; (b) chooses a similar set of graphemes as people do in situations where different graphemes can potentially be selected; (c) predicts orthographic effects on segmentation that are found in human data; and (d) can be readily integrated into a full-blown model of multi-syllabic reading aloud such as CDP++ (Perry, Ziegler, & Zorzi, 2010). Altogether, these results suggest that the model provides a plausible hypothesis for the kinds of computations that underlie the use of graphemes in skilled reading.
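To make the letter-to-grapheme segmentation task concrete, the sketch below shows a greedy longest-match parser with a crude onset/vowel/coda labelling rule for a single syllable. This is purely illustrative and is not the learned connectionist mechanism described in the abstract; the grapheme inventory, the greedy rule, and the labelling heuristic are hypothetical simplifications chosen only to show what kind of output such a segmentation produces.

```python
"""Illustrative sketch only: greedy longest-match grapheme segmentation.

NOT the model from the paper, which *learns* the letter-to-grapheme
mapping; the inventory and labelling rule here are assumed for clarity.
"""

# Hypothetical multi-letter grapheme inventory (assumption, not from the paper).
GRAPHEMES = {"tch", "igh", "sh", "ch", "th", "ck", "ee", "ea", "oa", "ai", "ou", "gh"}
VOWEL_LETTERS = set("aeiouy")


def segment(word: str) -> list[str]:
    """Split a letter string into graphemes by greedy longest match."""
    graphemes, i = [], 0
    while i < len(word):
        # Try the longest candidate first, falling back to a single letter.
        for size in (3, 2, 1):
            chunk = word[i:i + size]
            if size == 1 or chunk in GRAPHEMES:
                graphemes.append(chunk)
                i += len(chunk)
                break
    return graphemes


def label(graphemes: list[str]) -> list[tuple[str, str]]:
    """Assign each grapheme a rough onset/vowel/coda label (one syllable only)."""
    labels, seen_vowel = [], False
    for g in graphemes:
        if g[0] in VOWEL_LETTERS:
            labels.append((g, "vowel"))
            seen_vowel = True
        else:
            labels.append((g, "coda" if seen_vowel else "onset"))
    return labels


if __name__ == "__main__":
    for w in ("though", "check", "teach"):
        print(w, "->", label(segment(w)))
        # e.g. though -> [('th', 'onset'), ('ou', 'vowel'), ('gh', 'coda')]
```

A hand-built lookup like this cannot resolve the ambiguous cases the behavioral experiment targets (where different graphemes can potentially be selected); handling those cases is precisely what the learned model is evaluated on.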

Keywords: Computational modeling; Connectionism; Graphemes; Orthography; Reading.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Language*
  • Learning
  • Linguistics
  • Models, Theoretical*
  • Phonetics
  • Reading*