Testing a computational model of causative overgeneralizations: Child judgment and production data from English, Hebrew, Hindi, Japanese and K'iche'

Ben Ambridge; Laura Doherty; Ramya Maitreyee; Tomoko Tatsumi; Shira Zicherman; Pedro Mateo Pedro; Ayuno Kawakami; Amy Bidgood; Clifton Pye; Bhuvana Narasimhan; Inbal Arnon; Dani Bekman; Amir Efrati; Sindy Fabiola Can Pixabaj; Mario Marroquín Pelíz; Margarita Julajuj Mendoza; Soumitra Samanta; Seth Campbell; Stewart McCauley; Ruth Berman; Dipti Misra Sharma; Rukmini Bhaya Nair; Kumiko Fukumura

doi:10.12688/openreseurope.13008.2

Open Res Eur. 2022 Jan 12;1:1. doi: 10.12688/openreseurope.13008.2. eCollection 2021.

Authors

Ben Ambridge¹ ², Laura Doherty¹, Ramya Maitreyee¹, Tomoko Tatsumi³, Shira Zicherman⁴, Pedro Mateo Pedro⁵, Ayuno Kawakami¹, Amy Bidgood⁶, Clifton Pye⁷, Bhuvana Narasimhan⁸, Inbal Arnon⁴, Dani Bekman⁴, Amir Efrati⁴, Sindy Fabiola Can Pixabaj⁵, Mario Marroquín Pelíz⁵, Margarita Julajuj Mendoza⁵, Soumitra Samanta¹ ², Seth Campbell⁹, Stewart McCauley¹⁰, Ruth Berman¹¹, Dipti Misra Sharma¹², Rukmini Bhaya Nair¹³, Kumiko Fukumura¹⁴

Affiliations

¹ University of Liverpool, Liverpool, UK.
² ESRC International Centre for Language and Communicative Development (LuCiD), Liverpool, UK.
³ Kobe University, Kobe, Japan.
⁴ Hebrew University of Jerusalem, Jerusalem, Israel.
⁵ Universidad del Valle de Guatemala, Guatemala City, Guatemala.
⁶ University of Salford, Salford, UK.
⁷ University of Kansas, Lawrence, Kansas, USA.
⁸ University of Colorado, Boulder, Boulder, Colorado, USA.
⁹ University of Calgary, Calgary, Canada.
¹⁰ University of Iowa, Iowa City, Iowa, USA.
¹¹ Tel Aviv University, Tel Aviv, Israel.
¹² Indian Institute of Information Technology, Hyderabad, India.
¹³ Indian Institute of Technology, Delhi, India.
¹⁴ University of Stirling, Stirling, UK.

Abstract

How do language learners avoid producing verb argument structure overgeneralization errors (*The clown laughed the man; cf. The clown made the man laugh), while retaining the ability to apply such generalizations productively when appropriate? This question has long been seen as both particularly central to acquisition research and particularly challenging. Focussing on causative overgeneralization errors of this type, a previous study reported a computational model that learns, on the basis of corpus data and human-derived verb-semantic-feature ratings, to predict adults' by-verb preferences for less- versus more-transparent causative forms (e.g., *The clown laughed the man vs The clown made the man laugh) across English, Hebrew, Hindi, Japanese and K'iche' Mayan. Here, we tested the ability of this model (and an expanded version with multiple hidden layers) to explain binary grammaticality judgment data from children aged 4;0-5;0, and elicited-production data from children aged 4;0-5;0 and 5;6-6;6 (N=48 per language). In general, the model successfully simulated both children's judgment and production data, with correlations of r=0.5-0.6 and r=0.75-0.85, respectively, and also generalized to unseen verbs. Importantly, learners of all five languages showed some evidence of making the types of overgeneralization errors - in both judgments and production - previously observed in naturalistic studies of English (e.g., *I'm dancing it). Together with previous findings, the present study demonstrates that a simple learning model can explain (a) adults' continuous judgment data, (b) children's binary judgment data and (c) children's production data (with no training on these datasets), and therefore constitutes a plausible mechanistic account of the acquisition of verbs' argument structure restrictions.

Keywords: English; Hebrew; Hindi; Japanese; K'iche'; causative; child language acquisition; discriminative learning; verb semantics.

Grants and funding

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 681296; project CLASS). Ben Ambridge is Professor in the International Centre for Language and Communicative Development (LuCiD) at The University of Liverpool. The support of the Economic and Social Research Council [ES/L008955/1] is gratefully acknowledged.