Language learning as language use: A cross-linguistic model of child language development.

While usage-based approaches to language development enjoy considerable support from computational studies, there have been few attempts to answer a key computational challenge posed by usage-based theory: the successful modeling of language learning as language use. We present a usage-based computational model of language acquisition which learns in a purely incremental fashion, through online processing based on chunking, and which offers broad, cross-linguistic coverage while uniting key aspects of comprehension and production within a single framework. The model’s design reflects memory constraints imposed by the real-time nature of language processing, and is inspired by psycholinguistic evidence for children’s sensitivity to the distributional properties of multiword sequences and for shallow language comprehension based on local information. It learns from corpora of child-directed speech, chunking incoming words together to incrementally build an item-based “shallow parse.” When the model encounters an utterance made by the target child, it attempts to generate an identical utterance using the same chunks and statistics involved during comprehension. High performance is achieved on both comprehension- and production-related tasks: the model’s shallow parsing is evaluated across 79 single-child corpora spanning English, French, and German, while its production performance is evaluated across over 200 single-child corpora representing 29 languages from the CHILDES database. The model also succeeds in capturing findings from children’s production of complex sentence types. Together, our modeling results suggest that much of children’s early linguistic behavior may be supported by item-based learning through online processing of simple distributional cues, consistent with the notion that acquisition can be understood as learning to process language.