Centre for Research in Development, Instruction and Training
 

Modelling the acquisition of syntactic categories

Project Team: Steve Croker, Fernand Gobet, Gary Jones, and Julian Pine.

Aims

This research uses EPAM (Feigenbaum & Simon, 1984), a computational theory of perception and learning, to explain various phenomena in children's syntax acquisition. The basic EPAM architecture is extended to provide a performance-limited distributional account of children's language learning. The various strands of the syntax project use this extended architecture to examine how many of the phenomena associated with children's syntax acquisition can be explained by the relatively simple distributional account that the architecture provides.


Theoretical Background

The basic assumptions are that (1) syntactic categories are actively constructed by the child using distributional learning abilities; and (2) cognitive constraints on learning rate and memory capacity limit these learning abilities. The distributional learning mechanism that has been developed is capable of constructing grammatical categories, and it does so in a way that is consistent with recent findings in the developmental literature on the sequencing of grammatical category acquisition.
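As a rough illustration of what "distributional" means here, the sketch below groups words by the contexts they share. It is not the project's mechanism: the frame definition (previous word, next word), the Jaccard overlap measure, the function names and the sample utterances are all assumptions chosen for illustration, and no memory or learning-rate limits are modelled.

```python
from collections import defaultdict


def contexts(utterances):
    """Map each word to the set of (previous word, next word) frames it occurs in."""
    frames = defaultdict(set)
    for utterance in utterances:
        # Hypothetical boundary markers so that first and last words also get a frame.
        words = ["<s>"] + utterance.split() + ["</s>"]
        for i in range(1, len(words) - 1):
            frames[words[i]].add((words[i - 1], words[i + 1]))
    return frames


def overlap(frames, a, b):
    """Jaccard overlap between the frame sets of two words (0.0 to 1.0)."""
    union = frames[a] | frames[b]
    return len(frames[a] & frames[b]) / len(union) if union else 0.0


# Invented sample input, not data from the study.
sample = ["the dog runs", "the cat runs", "the dog sleeps", "the cat sleeps"]
frames = contexts(sample)
```

In this toy input, "dog" and "cat" occur in exactly the same frames, so `overlap(frames, "dog", "cat")` is high, while "dog" and "runs" share no frames. Words with high overlap are candidates for membership of the same category.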


On-Going Research

The project has been designed to complement a large-scale naturalistic study of children's early grammatical development that took place at Nottingham and Manchester. The EPAM architecture has been extended (Gobet & Pine, 1997) to form MOSAIC (Model of Syntax Acquisition in Children). MOSAIC takes as input the utterances of the mothers in the Nottingham/Manchester study and provides a performance-limited distributional account of this input in the form of a discrimination net (a network of nodes linked to each other to form a hierarchical structure). The output from MOSAIC consists of utterances obtained by traversing the discrimination net. These utterances are remarkably child-like because of the performance limitations that MOSAIC imposes. The project currently has two strands: (1) modelling children's early verb use, and (2) modelling the optional infinitive hypothesis.

(1) One of the most influential constructivist accounts of children's early verb use is the verb-island hypothesis (Tomasello, 1992). According to this view, children's early grammar consists of inventories of verb-specific predicate structures. That is, their language is built around verbs which take arguments that are specific to the verb. For example, in the sentence "John walks the dog", "John" and "dog" are not seen as subject and object but as "someone who walks things" and "something that can be walked". Whilst consistent with much of children's early verb use, the verb-island hypothesis has the problem that lexical items other than verbs (such as pronouns) also seem to act as "islands" in children's speech (Pine, Lieven & Rowland, 1998). Data from two of the children and their mothers in the Nottingham/Manchester study have been examined, together with output from MOSAIC when trained on each mother's utterances (Jones, Gobet & Pine, in preparation; Jones, Gobet & Pine, submitted). The results show that the utterances produced by MOSAIC (a) resemble the child's data more closely than they resemble the mother's data on which MOSAIC is trained, and (b) readily simulate both the verb-island and other-island phenomena which exist in the child's data.

(2) The Optional Infinitive hypothesis proposed by Wexler (1994) is a theory of children's early grammatical development that can be used to explain a variety of phenomena in children's early multi-word speech. However, Wexler's theory attributes a great deal of abstract knowledge to the child on the basis of rather weak empirical evidence. The data from one child in the Nottingham/Manchester study were examined together with the output from MOSAIC when trained on the utterances of that child's mother (Croker, Pine & Gobet, 2000). The results show that the child makes errors that are consistent with the Optional Infinitive hypothesis but also makes errors that are inconsistent with it. The results from MOSAIC show that the model makes the same errors as the child, and therefore undermine the claim that Optional Infinitive phenomena require an abstract grammatical analysis.
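The discrimination net described under On-Going Research can be pictured with a small sketch. This is a heavily simplified illustration, not MOSAIC itself: the `Node` class, the `train` and `produce` functions, the rule of adding at most one node per presentation (standing in for a limited learning rate) and the sample input are all assumptions made for the example.

```python
class Node:
    """One node in the net; each link to a child is labelled with a word."""

    def __init__(self):
        self.children = {}


def train(root, utterance):
    """Present one utterance and extend the net by at most one node.

    Adding a single node per presentation is an assumed stand-in for a
    limit on learning rate; it is not taken from the MOSAIC papers.
    """
    node = root
    for word in utterance.split():
        if word not in node.children:
            node.children[word] = Node()
            return  # stop learning after the first unfamiliar word
        node = node.children[word]


def produce(root):
    """Return every word sequence obtained by traversing the net from the root."""
    results = []

    def walk(node, prefix):
        for word, child in sorted(node.children.items()):
            path = prefix + [word]
            results.append(" ".join(path))
            walk(child, path)

    walk(root, [])
    return results


# Invented sample input, not data from the Nottingham/Manchester study.
root = Node()
maternal_input = ["he walks the dog", "he walks the dog", "he walks home"]
for utterance in maternal_input:
    train(root, utterance)
output = produce(root)
```

After these three presentations the net holds only the path he, walks, home, so the traversal yields the truncated sequences "he", "he walks" and "he walks home". Even in this toy version, the learning limit means the output is shorter and less complete than the input it was trained on.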

Further variants of EPAM are being used by our EPAM research group in projects linked to the current one. Two such projects are the modelling of chess expertise (Gobet & Simon, 2000) and the acquisition of multiple representations in physics (Lane, Cheng & Gobet, 1999). These various projects show that EPAM is able to model a variety of phenomena in a wide range of domains.


References

Selected research group publications

External references
