Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time,...

57
Learning Argument Structure Generalizations * Adele E. Goldberg University of Illinois Nitya Sethuraman University of California, San Diego Devin M. Casenhiser University of Illinois July 8, 2022 * The authors are grateful for helpful discussions with Cindy Fisher, Gregory Murphy, Dedre Gentner. We thank Aarre Laakso and Dan Jackson for writing programs to help us analyze the corpus data, and Giulia Bencini, Bill Croft, Dan Jackson, Aarre Laakso, Gregory Murphy and two anonymous reviewers for helpful comments. This research was funded in part by the first author’s NSF Grant SBR-9873450. 1

Transcript of Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time,...

Page 1: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Learning Argument Structure Generalizations*

Adele E. GoldbergUniversity of IllinoisNitya Sethuraman

University of California, San DiegoDevin M. CasenhiserUniversity of Illinois

May 20, 2023

* The authors are grateful for helpful discussions with Cindy Fisher, Gregory Murphy, Dedre Gentner. We thank Aarre Laakso and Dan Jackson for writing programs to help us analyze the corpus data, and Giulia Bencini, Bill Croft, Dan Jackson, Aarre Laakso, Gregory Murphy and two anonymous reviewers for helpful comments. This research was funded in part by the first author’s NSF Grant SBR-9873450.

1

Devin M. Casenhiser, 01/03/-1,
reviewed by JCL: Nov 4, 1999:promised to be back to me by March 4th; finally heard in May
Page 2: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Abstract

General correlations between form and meaning at the level of argument structure patterns have often been assumed to be innate. Claims of innateness typically rest on the idea that the input is not rich enough for general learning strategies to yield the required representations. The present work demonstrates that the semantics associated with argument structure generalizations can indeed be learned, given the nature of the input and an understanding of general categorization strategies. It is well-established that (non-linguistic) categorization is driven by a functional demand for prediction and is sensitive to frequency and order of acquisition effects. The same elements are argued to account for how semantic generalizations about argument structure constructions can be learned from the input. Furthermore, examination of an extensive corpus study of children's and mother's speech shows that one particular verb is found to account for the lion’s share of instances of each argument frame considered. It is suggested that this high frequency aids learners in initially determining the meaning associated with argument structure patterns.

1 IntroductionFor some time, linguists have observed that within a given language, there exist certain

formal patterns that correlate strongly with the meaning of the utterance in which they appear. Such correlations between form and meaning have been variously described as linking rules projected from the main verb’s specifications (e.g.,Bresnan & Kanerva, 1989; Davis, 1996; Dowty, 1991; Grimshaw, 1990; Jackendoff, 1983; Pinker 1989), as lexical templates overlain on specific verbs (Hovav & Levin, 1998), or as phrasal form and meaning correspondences (constructions) that exist independently of particular verbs (Goldberg, 1995; Jackendoff, to appear).

One way to account for the association of meanings with particular forms i to claim that the association is innate (Gleitman, 1994; Grimshaw, 1990; Pinker, 1984, 1989, 1994). This claim generally rests on the idea that the input is not rich enough for the relevant generalizations to be learned; this is the well-known “poverty-of-the-stimulus” argument (Chomsky, 1980, 1988; Pinker, 1994). This view of learning a grammar can be likened to customizing a software package: everything is there, and the learner simply selects the parameters that are appropriate for his environment (Jackendoff, to appear, Chapter 7). Many have criticized this approach for its biological implausability (Bates & Goodman, 1998; Deacon, 1997; Elman et al., 1996; Sampson, 1997). Moreoever, there have been virtually no successful proposals for what any specific aspect of the parameters might look like (Ackerman & Webelhuth, 1988; Culicover, 1999; Jackendoff, to appear; Newmeyer, 1998).

2

Devin M. Casenhiser, 01/03/-1,
I’m promoting the discussion of the POS and integrating it with the explanation of our stand. Doing so, I also edited out the references to constructionist accounts during the POS discussion. I then demoted the explanation of our constructionist account—I hope that will keep future readers from thinking that the whole argument hinges on the existence of constructions.
Page 3: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

This paper joins the growing body of literature that detracts from the poverty-of-the-stimulus argument by presenting evidence that the nature and properties of at least certain patterns in language are learnable on the basis of general categorization strategies {see also e.g., Taylor, 1995 #134; Bybee, 1983 #32; Bybee, 1982 #31; Jackendoff, 2002 #78; Lakoff, 1987 #83}. In this paper we argue that language input provides more than adequate means by which learners can induce the association of meaning with certain argument structure patterns insofar as well-established categorization principles apply straightforwardly to this domain. In particular, we observe that categorization is known to be driven by a functional demand for prediction and that it is sensitive to frequency and order of acquisition effects. We discuss how these same elements can be used to explain how the association of meaning with particular linguistic patterns can be learned.

Throughout this paper, we adopt constructional terminology, but the ideas we present are not exclusive to a constructionist account. Those who favor one of the other terminologies mentioned above need only construe this account as a proposal for how children can learn linking rules or learn the semantics associated with various lexical templates on the basis of the input. In Error: Reference source not found, we provide a partial list of correlations that exist between form and meaning (lexical templates, combination of linking rules, constructions) along with the labels we use throughout the paper to refer to them.

Construction Label/Example Meaning Form1. Intr. motion (VL) X moves Ypath/loc Subj V Oblpath/loc

The fly buzzed into the room.2. Caused-Motion (VOL) X causes Y to move Zpath/loc Subj V Obj Oblpath/loc

Pat sneezed the foam off the cappuccino.3. Resultative (VOR) X causes Y to become Zstate Subj V Obj RPShe kissed him unconscious.4. Double Object (VOO) X causes Y to receive Z Subj V Obj Obj2Pat faxed Bill the letter.

Table 1. Examples of Correlations Between Form and Meaning

The constructional labels are used only as mnemonics and are not crucial to the present analysis. What is crucial is the uncontroversial notion that there do in fact exist correlations between formal linguistic patterns and meaning.

Before we discuss our corpus findings, we review evidence that children store the associations of meanings with forms on two levels. The first involves the acquisition of verb-centered categories whereby children conservatively produce syntactic patterns on a verb-by-verb basis (Akhtar & Tomasello, 1997; Baker, 1979; Bates & MacWhinney, 1987; Bowerman, 1982; Braine, 1976; Brooks & Tomasello, 1999; Gropen, Pinker, Hollander, Goldberg, & Wilson, 1989; Ingram & Thompson, 1996; Lieven, Pine, & Baldwin, 1997; MacWhinney, 1982; Olguin & Tomasello, 1993; Schlesinger, 1982; Tomasello, 1992). Ultimately, however, generalizations over specific verbs are made, forming speakers’ knowledge of argument structure patterns (Akhtar, 1999; Bowerman,

3

Page 4: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

1982; Brooks & Tomasello, 1999). We aim to account for the existence of these two levels by addressing the question of why learners make the generalizations they do.

We do not address the question of exactly when generalizations emerge in this study; that is, no specific time line is suggested (see Fisher 2002 and Tomasello, to appear for a range of views on this issue). Instead we address the more general question of how it is possible for children to go from specific knowledge of utterances and individual verbs to knowledge of more general linking patterns using general inductive strategies. That children do have both levels of generalization is fairly uncontroversial (evidence is briefly reviewed in sections 2 and 3).

It is clear that before learners are able to form any type of semantic generalization about argument structure, either at the level of the verb or at the level of the construction, they need to have some rough understanding of individual sentence-level utterances. We speculate in the following section that general Gricean principles and non-linguistic context play a critical role. In subsequent sections we return to the main questions this paper aims to address: why should verbs so often be the basis of initial clause-level categorization? and, why and how are children ultimately led to generalize to the level of the construction?

1.1 Initial interpretations of sentencesNon-linguistic context alone must be sufficient for the child to learn the meanings of his or her very first words. Gillette et al. (1998) have shown experimentally that this is plausible for nouns. Adult subjects were shown video clips of mother-child interactions; the sound was turned off except for a beep at the point in the clip when the mother uttered a particular word. Subjects were asked to guess the particular word uttered after witnessing six contexts in which the word was uttered (masked by the beep). They performed fairly well when guessing nouns, averaging 45% correct, but performed much more poorly when guessing verbs, guessing only 15% correctly.

We would like to suggest that verb-centered categories can be learned without necessarily having prior knowledge of verb meaning. We need only assume some partial understanding of a number of nouns or pronouns, and an ability to make hypotheses about what is intended in certain contexts.

We assume the following principle, directly motivated by Grice (1975)’s Maxim of Relevance (“make your contribution relevant”) and Horn’s (1984) R-maxim (“say no more than you must’’):

If a word w is expressed, then the meaning of the word, “w,” is relevant to the situation being conveyed.

From this assumption alone, the child can infer that each known word expressed that refers to an entity indicates a relevant participant in the scene. This is a type of conversational implicature: an inference drawn in context on the basis of what is said and assumptions about the speaker’s communicative goals.1 If we assume the child only

1 It is quite possible that “dumb” attentional processing, as opposed to strategic inferences, may be all that is required insofar as hearing a word serves to direct the learner’s attention to the referent of that word (seeSmith, Jones, & Landau, 1996, for discussion of this idea)

4

Page 5: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

understands you and bed, upon hearing the expression, you mop to bed, in the relevant context, the child might well be able to infer the relevant meaning of the unknown word mop as possibly “go” or “return.”2

The viability of this route to interpretation specifically for verbs has been demonstrated experimentally. As noted above, Gillette et al. (1998) found that subjects performed poorly at guessing verbs in the presence of only non-linguistic context. At the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical order—subjects were not provided with syntactic frames or even word order), subjects’ performance in correctly guessing the verbs was significantly increased over performance where only scenes or only nouns were available (29% vs 15% and 16.5%, respectively). Further evidence comes from Fisher (1996), who taught three and five year old children novel verbs in different sentence frames and informative non-linguistic contexts. The identity of subject and object referents was obscured by using pronouns that were ambiguous in context (She’s daxing her over there). Children demonstrated a significant sensitivity to the number and type of arguments in interpreting the novel verbs. Finally, Akhtar (1999) taught 2-4 year old children novel verbs in non-English word orders, and found that children had no trouble correctly interpreting sentences even though they only had knowledge of the two noun referents and the non-linguistic context (Nameera Akhtar, personal communication, May 5, 1998).

These findings predict that early on, children will produce errors in interpreting intransitive utterances that contain two recognizable nouns, if context allows for a transitive interpretation. As Fisher (2000) observes, based on data from Hirsh-Pasek and Golinkoff (1996), this prediction appears to be borne out as children younger than 25 months do misinterpret intransitives with conjoined subjects (Watch Big Bird and Cookie Monster flexing) as transitives.

Thus it seems that using nouns to guide the attentional process is a viable route to interpreting early utterances without prior knowledge of verb meaning. We can view this kind of verb learning as using a type of pragmatic bootstrapping—a procedure distinct from both syntactic bootstrapping (Landau & Gleitman, 1985) and semantic bootstrapping (Grimshaw, 1981; Pinker, 1984). Classic syntactic bootstrapping assumes that the child uses information about the structural positions of the nouns, e.g., which one is subject, in determining the verb’s semantic properties (Fisher, Gleitman, & Gleitman, 1991). While we concur that once such structural knowledge is acquired, it is used to infer properties of unknown verbs (see Sethuraman, Goldberg, & Goodman, 1997), we do not assume that the child must have abstract syntactic knowledge or knowledge of the rules of composition at the point when the first argument structure expressions are interpreted.

2 This type of learning is reminiscent of Carey and Bartlett’s (1978) “fast-mapping.” Carey and Bartlett demonstrated that children were able to infer the meaning of an unknown word (chromium) when used as an adjective in a context that contrasted it explicitly with known color terms. Specifically, a teacher asked 3 year old children individually to “show me the chromium tray, not the red one, the chromium one.” Because there were only two trays, one red, one olive, children were able to infer that chromium must be a word for the color olive. However, the present suggestion for verb-learning can apply earlier and more generally since it does not presuppose any comprehension of the syntactic structures of the language, nor a contrastive context, but simply the knowledge of a small number of nouns and/or pronouns.

5

Page 6: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Pragmatic bootstrapping is also intended to be distinct from semantic bootstrapping, as the latter is traditionally understood. Classic semantic bootstrapping (Pinker, 1989) has been suggested as a way for the child to learn a phrase structure grammar. It assumes that the child is able to initially learn the meanings of both nouns and verbs based solely on the non-linguistic context. Pragmatic bootstrapping suggests a way in which verbs and other grammatical categories can be learned once a handful of nouns are learned. This process can be likened to semantic bootstrapping insofar as any known words that are heard when a sentence is uttered become part of the context, and can therefore be used to help direct a child’s attention to what is being conveyed. We now turn out attention to the question of how generalizations over individual utterances are made.

2 Verb-Centered Categories ExistMany studies have demonstrated that the initial production of argument structure

patterns is very conservative in that children stick closely to the forms they have heard used with particular verbs (Akhtar & Tomasello, 1997; Baker, 1979; Bates & MacWhinney, 1987; Bowerman, 1982; Braine, 1976; Brooks & Tomasello, 1999; Gropen et al., 1989; Ingram & Thompson, 1996; Lieven et al., 1997; MacWhinney, 1982; Olguin & Tomasello, 1993; Schlesinger, 1982; Tomasello, 1992). For example, Olguin and Tomasello (1993) taught 25 month old children four novel transitive verbs, each in a different syntactic pattern: both participants expressed, agent only, patient only or neither argument expressed. Children almost always reproduced the same exact pattern they had heard. Tomasello (1992) observed that by far the best predictor of his child’s use of a given verb on a particular day was her use of the same verb on the previous few days, not, as might be expected, her use of other verbs on the same day. Tomasello and his colleagues have discussed this verb-centered conservatism under the rubric of verb islands, since children readily substitute new nominals into the frames (Akhtar & Tomasello, 1997; Clark, 1996; Gropen, Epstein, & Schumacher, 1997; Tomasello, 1992; Tomasello, Aktar, Dodson, & Rekau, 1997).

There is evidence that adults retain much verb-specific knowledge as well. Verbs are occasionally quite idiosyncratic in the types of argument structure patterns they appear in (Bresnan, 1982; Chomsky, 1965; Pollard & Sag, 1987). For example, the near synonyms help and aid differ in their distribution:

(1) Pat helped her grandmother walk up the stairs.

*Pat aided her grandmother walk up the stairs.

(2) ??Pat helped her grandmother in walking up the stairs.

Pat aided her grandmother in walking up the stairs.

Psycholinguistic studies have demonstrated that speakers are influenced by the relative frequencies with which they have heard particular verbs used in various argument structure constructions (Ford, 1982; MacDonald, Pearlmutter, & Seidenberg, 1993). For example, knowledge that believed is more likely to appear with a clausal complement than with an object complement influences speakers’ on-line interpretation of potentially ambiguous sentences (Garnsey, Pearlmutter, Myers, & Lotocky, 1997; Trueswell,

6

Page 7: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Tanenhaus, & Kello, 1993). The relative frequencies play a role despite the fact that both possibilities are fully grammatical as in the examples 3a-b:3

(3) Pat believed the speaker might cause a riot.

Pat believed the speaker.

We are suggesting that the earliest argument structure knowledge is, for example, roughly of the form given in (4). The angled brackets are intended to capture the fact that children readily substitute arguments other than the verb early on, subject to semantic constraints.

(4) < actor> put < thing > < location >

Why should verbs be the basis of initial argument structure patterns? That is, why should the verb tend to be the only initially fixed slot in sentences rather than any other single word?

2.1 The predictive value of verbs in argument structure patternsA critical factor in the primacy of verbs in argument structure patterns stems from their relevant predictive value. As many psychologists have emphasized, human categorization is generally driven by some functional pressure, typically the need to predict or infer certain properties on the basis of perceived characteristics (Anderson, 1991; Holland, Holyoak, & Thagard, 1989; Kersten & Billman, 1997; Leake & Ram, 1995; Ross & Makin, 1999; Wisniewski, 1995). That is, cognitive systems do not generalize randomly or completely. Holland et al. (1989), in their monograph on induction, emphasize that generalizations are constrained in that “the inferences drawn by a cognitive system will tend to be...relevant to the system’s goals” (p. 5). In the case of language, the language learner’s goal is to understand and to be understood: to comprehend and produce language. There is ample functional pressure to predict meaning on the basis of given lexical items and grammatical characteristics (comprehension); conversely, there is pressure to predict the choice of lexical items and grammatical characteristics given the message to be conveyed (production). Since the sentences the child is learning to understand and produce form an open-ended set, it is not sufficient to simply memorize the sentences that have been heard. The child must necessarily generalize those patterns at least to some extent in order to understand and produce new utterances.

If we compare verbs with other words (e.g., nouns), verbs are much better predictors of overall sentence meaning. Experimental evidence for this idea is provided by Healy & Miller (1970). Healy and Miller compared the relative contribution of verbs and subject arguments to overall sentence meaning. These two candidates, verb and subject, were presumably chosen because they appear to be the best candidates for representing overall sentence meaning. The subject argument is often referred to as the “topic” argument in a sentence or what the sentence is “about” (Kuno, 1972; Lambrecht, 1994; Reinhart, 1982).

3 Findings of this sort demonstrate that detailed verb-specific knowledge about frequencies of usage clearly influence adult grammar. Knowledge of specific verbs continues to be useful in several ways. Verbs are required for more specific interpretation, for example the contrast between give and hand, pass, throw, or send. Verb-centered categories are also required for occasional idiosyncratic patterns (e.g., help vs aid). Moreover, as many have observed, individual verbs play a key role in determining exceptions to constructional generalizations (see below).

7

Page 8: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

At the same time, the verb provides a great deal of information about who did what to whom. Healy and Miller constructed 25 sentences by crossing 5 subject arguments (the salesman, the writer, the critic, the student, the publisher), 5 verbs (sold, wrote, criticized, studied, published) and one patient (the book). Participants were asked to sort the sentences into five piles according to similarity in meaning. Results showed that participants sorted sentences together that had the same verb much more often than sentences that had the same subject argument (p < .008). That is, for example, all five sentences with the verb criticized were categorized together much more often than five sentences with the subject the critic. Healy and Miller concluded that the verb is the main determinant of sentence meaning4.

Another source of evidence for the idea that the verb is a good predictor of sentence meaning comes from work on analogy. It has been richly documented that relational aspects of meaning are fruitful sources of analogy and similarity judgements (Gentner, 1982). Markman and Gentner (1993), for example, found that in making non-linguistic similarity judgments, similarity is judged to be greater when two representations are related in a systematic way via the relations that hold between the entities in each representation. The entities are aligned based on the structure that relates them, rather than on the basis of independent characteristics of the entities. The relevance to language is straightforward. In comparing two sentences, the main relational predicates, the verbs, are more likely to be used than the independent characteristics of the arguments (see Tomasello, 2000, for similar observations). Why should this be so? The purpose of analogies is generally one of drawing inferences and making predictions: what can be predicted on the basis of one situation about another situation (Gentner & Medina, 1998). Thus it is the value of verbs as good cues to sentence meaning that results in the child’s early learning of verb-centered argument structure patterns (verb islands).

Many verbs appear frequently in more than one construction, with concomitant variation in meaning. For example, while get seems to mean “acquire” in She got it, it is better interpreted as “make” in She got him angry; go appears frequently as a future marking auxiliary in addition to being the most frequent verb in the intransitive motion construction. It must be borne in mind that it is not the verb in isolation that is claimed to be highly predictive of sentence meaning, but rather the use of a verb in a particular construction. We return to this idea in section 3.5. As was outlined initially in section 2, the early verb islands involve verbs in particular argument patterns or constructions, not verbs in isolation.

Given the fact that early learning is based on particular verbs in constructional patterns, we can now turn to the questions of how and why learners ever generalize away from the verb to the more abstract level of argument structure constructions. Before focusing on this issue, we first review evidence that generalizations over particular verbs are in fact eventually made.

4 We revisit this experiment in section 3.5.2.

8

Page 9: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

3 Learning constructional generalizations

3.1 Generalizations over verbs existIf generalizations were not necessarily made, we might expect to find languages whose argument structure patterns varied arbitrarily on a verb-by-verb basis. For example, we might expect to find one semantically transitive verb expressed by SVO (Subject Verb Object) word order, another expressed by SOV order, and a third verb expressed by VSO order:

(5) Pat saw Chirs.

Pat Chris kissed.

Hate Pat Chris.

But in fact languages are much more regular. Semantically similar verbs show a strong tendency to appear in the same argument structure constructions. Help and aid cited above are unusual; more typically, verbs that are closely related semantically do appear in the same argument structure constructions (Fisher et al., 1991; Goldberg, 1995; Gross, 1975; Levin, 1993; Pinker, 1989).

Further evidence that children generalize the patterns they use stems from the fact that they occasionally produce spontaneous overgeneralizations. The following examples come from Bowerman (1982):

(6) Christy 4;0: Will you have me a lesson? (VOO pattern)

Christy 3;4: She came it over there. (VOL pattern)

It is also clear that adults continue to spontaneously generalize argument structures patterns (Aronoff, 1976; Clark & Clark, 1979; Pinker, 1989). The attested examples in (7) provide examples of such adult overgeneralizations:

(7) “Once you resort to higher-level predicates, you can just lambda your way out of practically anything.” (reported by John Moore, May 1995)

“He concentrated his hand steady.” (reported by Georgia Green, found in Russell Atwood’s East of A, NY: Ballentine Books 1999).

“I’ll just croak my way through, I guess.” (reported by Mike Tomasello, May 1996)

[ADD EXAMPLES FROM PINKER 1989 HERE?]

The successful manipulation and comprehension of nonsense verbs in experimental settings also demonstrates that speakers are in fact able to make generalizations (Akhtar & Tomasello, 1997; Gropen et al., 1989; Naigles, 1990).

9

Page 10: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

In the following section we take a closer look at how children come to associate the particular semantic properties with the argument structure patterns that they do.

3.2 General Purpose VerbsAs represented in Table 2, the meanings of certain verbs as used in particular argument structure constructions bear a striking resemblance to the meanings independently posited for those argument structure constructions (as discussed in the introduction and summarized in Error: Reference source not found).

put X causes Y to move Z VOL: Caused Motiongo X moves Y VL: Intransitive Motiondo X acts on Y VO: Transitivemake X causes Y to become Z VOR: Resultativegive X causes Y to receive Z VOO: Ditransitive

Table 2. Main verbs and the constructional meanings they correspond to

“General purpose” verbs, including “put, go, do, make” are among the first and most frequent verbs in many languages (Clark, 1978, 1996). Clark cites data from Bowerman (1973) for Finnish, Grégoire (1937) for French, Sanchés (1978) for Japanese, and Park (1977) for Korean; Ninio (1999), discussed below, provides similar data from Hebrew.

The generality of the meanings of these verbs and their early appearance in children’s speech suggests that they may aid children in generalizing patterns from the input. Speculations about the close relationship between certain verbs and argument structure patterns have been made previously by leading researchers in linguistics and language acquisition. Fillmore et al. (ms. section 3.2) observe that “it is possible to think of the AS [argument structure] constructions as in some sense ‘derived from’ the semantics of their most neutral verbs” (see also Goldberg, 1998). Clark (1996) likewise speculates that certain early-learned verbs may serve as “templates” for further acquisition on the basis of their semantic characteristics. She demonstrates that children are aware of much relevant semantic knowledge pertaining to their early verbs as evidenced by their discriminating use of various inflectional morphemes. Ninio (1999) has also suggested that syntactic patterns emerge from generalizing the use of particular verbs. With the possible exception of Ninio (1999), discussed in section 3.4.3, these researchers do not attempt to flesh out this idea; as Fillmore et al. (ms.) note, “when we come to propose this seriously we will have to specify just what sort of ‘projection’ we are talking about...and what the mechanism is according to which the pattern of the verb is projected to the more general pattern.”

The present proposal is that it is the high frequency of particular verbs in particular constructions that allows children to get an initial ‘fix’ on the meanings associated with clausal constructions. We will see below that a single verb accounts for the lion’s share of tokens of each particular construction considered. In subsequent sections we address the question of what induces the child to generalize away from particular verbs to form an abstract linking generalization. The relevant mechanism involved is one of general categorization.

10

Page 11: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

3.3 Children’s Early Use of Verbs in the Bates CorpusDatabase

In order to examine more closely children’s early uses of particular constructions, we investigated a corpus of children’s early speech. The main language corpus used in this study is the Bates corpus (Bates, Bretherton, & Snyder, 1988) on the Child Language Data Exchange System (CHILDES) database (MacWhinney, 1995). This corpus contains transcripts from the Bates/Bretherton Colorado longitudinal sample of 27 middle-class children, 13 boys and 14 girls at age 20 and 28 months. There are transcripts for 15 minutes, equally divided into three types of mother-infant interaction: free play, reading of the book Miffy in the Snow, and snack time.

Data collection and coding

The speech of all 27 children at 28 months and the speech of 15 mothers to their children aged 28 months was extracted from the Bates et al. transcripts. Utterances were collected and hand-coded independently by the second two authors and then combined into one full list. Disagreements were resolved through discussion with the first author. Agreement between the two independent coders was 96.5% initially and 100% after discussion (n = 840) for classifying children’s utterances by construction type.

The speech of fewer adults was analyzed because there were more data points for each adult; i.e., parents spoke more than the children. A second set of 7 mothers was coded separately by the second author in order to determine whether there was any bias in our sample of 15 mothers. Identical trends were noted in the second sample so we used the sample of 15 mothers that had been coded independently by two coders.

Word and sentence segmentation decisions were respected. All complete child and adult utterances containing a verb were included in the coding. Four columns were used to describe every utterance: construction, verb, speaker+situation+age, and the actual utterance. Each utterance was coded in this way for verb and construction pattern. All patterns were coded, but three patterns are focused on in this paper:

Label FormVL: (Subj) V Oblpath/loc

VOL: (Subj) V Obj Oblpath/loc

VOO: (Subj) V Obj Obj2

After an initial sampling of sentences with overt subjects and sentences without separately, we collapsed the results because the only major difference between the two sets of data was that sentences without subjects were predominantly commands. There did not seem to be other distinctions relevant to the current discussion between the two sets of data.

Classifications were based primarily on form: categorizing an utterance as an instance of the VOO pattern required a verb and two NPs. Categorizing an utterance as an instance of the VL pattern required that there be a verb and some type of locative: a

11

Page 12: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

preposition phrase indicating location, a particle indicating location (e.g., down, in), a locative (there, here), or some combination. Thus, She lived in Pennsylvania and She went there would both be considered an instance of the VL formal pattern while She lived with her sister would not be because “with” is not a locational preposition. The VOL pattern required a verb with an object NP and some type of locative phrase. The emphasis on form in our coding scheme was intended to avoid presupposing that utterances had the intended meaning associated with the linking generalizations or constructions given in Table 1. In section 3.5.1 we analyze to what extent the form-based patterns of VOO, VL, and VOL actually do correlate with the suggested meanings and whether these patterns are more or less predictive of actual meaning than the verbs that occur in individual utterances. The decision to code for locative phrases (“L”s) instead of a purely syntactic category such as PPs was based on the fact that ultimate linking generalizations are known to be sensitive to general semantic properties of the oblique argument such as what general type of preposition is involved. Since at least certain prepositions and locative expressions (including up, down, off, in, on and here) are often among the first words uttered (Clark 1996), we assume that children are able to identify phrases as locative or non-locative.

We ignored variable word orders correlated with questions, topicalizations, etc. Questions such as What did she put in his eyes?, for example, were considered instances of the VOL pattern. **Utterances with such non-canonical word orders accounted for only XX% of the data***Nitya, could you calculate this on at least two subsets of the data?. Utterances considered unacceptable or ungrammatical to adults were included in the coding. We did not attempt to distinguish arguments from adjuncts because we did not want to assume the 28 month old children had mastered the distinction.

A sample from the coded transcript is given below:

VOL pour PAU-SN28:       pour some milk in it .VOL: throw: *HAN-SN28: throw diaper away !VOL: put:     *GEO FR28     I put him in .VOL: put:      *KEI -FR28:       put another ball in here .VOL: want: *WAN-ST28: want you in the house .VOO: give:    *GLO-SN28:      Cory gave me my turtle .VOO: give:    *EDD SN28:       give me some milk ?VL: get:     *KEI ST28:       she get in her bed .VL: go:      *CHU FR28:       it went in here .VL: come: *FRA-FR28: people come outside.

Table 3. Sample from the Bates et al. (1988) corpus

Results

In the children’s speech we find that the verbs that correspond to a construction’s meaning as given in Table 3 were overwhelmingly the most frequent verbs used in that

12

Page 13: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

construction.5 For example, the VL construction, exemplified by the sentence, I went into the store, has roughly the meaning corresponding to that of the verb go as used in that pattern:

(Subj)VOblloc

X MOVES Yloc “go”

Correspondingly, the verb go accounted for a full 54.02% (121/224) of the instances of that construction in the children’s speech. Other verbs occur in the construction, but with much less frequency:

go 54.02% (121/224)

get 6.25%fall 5.36%come 4.91%look, live, sit 3.57% each

A similar pattern emerged in the case of the VOL construction, exemplified by the sentence, I put the book on the table. This construction has semantics closely related to the lexical semantics of the verb put (cf.Goldberg, 1995); it turns out that put accounts for a full 31.37% (16/51) of the instances of the construction in the children’s speech.

(Subj) V Obj Oblpath/loc X causes Y to move Zpath/loc “put”

Again, other verbs appear in the construction, but with markedly less frequency:

put 31.37% (16/51)

get 15.69%take 9.80%do 5.88%pick 5.88%

The ditransitive or VOO construction, illustrated by the sentence Pat gave Chris a book, is closely associated with the semantics of transfer, or the meaning of the lexical verb, give (see Pinker 1989; Goldberg, 1992, 1995 for discussion of this and slight variations of constructional meaning, as well as general references).

5 Zipf long ago noted that highly frequent words account for most linguistic tokens (Zipf, 1935). Although he did not claim that there should be a single most highly frequent word that would correspond to the meaning of the overall pattern, nor did Zipf’s work prepare us for the fact that a single verb accounts for such a lion’s share of the tokens, his observation suggests that we may find a similar pattern in constructions other than argument structure constructions.

13

Page 14: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

(Subj)V Obj Obj2X CAUSES Y TO RECEIVE Z “give”

Only 6 instances of the VOO construction were found in the children’s speech in the Bates et al. corpus. These included 2 instances of give, 2 of make, and 2 of bring. This paucity of data is due to the fact that the VOO construction is only just beginning to be learned at 28 months. We therefore examined data collected by by Gropen et al. (1989) from longitudinal data in the Brown (1973) corpus of Adam, Eve, and Sarah together with MacWhinney’s (1995) data on Ross and Mark. Gropen et al. searched these corpora for instances of the VOO construction. Adam was recorded in 55 two hour samples taken every 2-4 weeks between age 2;3 and 4;10. Eve was recorded in 20 two hour samples taken every 2-3 weeks between age 1;6 and 2;3. Sarah’s speech was recorded in 139 one hour samples taken at 2-19 day intervals between the ages of 2;3 and 5;1. Ross and Mark were both recorded by their father Brian MacWhinney in 62 samples of varying sizes at varying intervals, Ross between the ages of 2;6 and 8;0, and Mark between the ages of 0;7 and 5;6.

As seen in Table 4, give is the most frequent verb to appear in this construction for the majority of children. The numbers for give do not include uses that Gropen et al. considered idiomatic. In the case of one child, Mark, there is no statistical difference between give and tell; the data set is too small for this particular child to draw any clear inferences. The overall frequency of give in the double object construction in this child’s speech outnumbers tell if idiomatic uses were included.

Child Most frequent verb in VOO

Percentage of Tokens Total # of Verb Types

Adam give 52.7% (59/112) 13

Eve give 36.4% (4/11) 5

Sarah give 43.3% (29/67) 12

Ross give 43.1% (69/160) 13

Mark tell

give

32.4% (11/34)

29.4% (10/34)

8

Table 4. The most frequent verb in ditransitive (VOO) constructionbased on Gropen et al. (1989)

To summarize the data so far, it is clear that the most frequent and early acquired verb corresponds to the semantic prototype of the construction for each of the three constructions examined.

14

Page 15: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

3.4 The role of high token frequency in categorizationResearch in cognition has demonstrated that there is a strong correlation between the frequency with which a token occurs and the likelihood that it will be considered a prototype by the learner (Nosofsky, 1988; Posner & Keele, 1968; Rosch & Mervis, 1975). Homa, Dunbar and Nohre (1991) found that token frequency was an important variable at early and intermediate stages of category learning, with increased token frequency facilitating category learning. In learning generalizations about dot patterns, Posner, Goldsmith and Welton (1967) demonstrated that the rate at which subjects classified patterns correctly was a direct function of the amount of distortion from their respective prototypes: the less variability or distortion, the faster the category was learned.

Elio and Anderson (1984) set up two conditions relevant to the current discussion. In the “centered” condition, subjects were initially trained on more frequently represented, more prototypical instances, with the study sample growing gradually to include more members of the category.6 In the “representative” condition, subjects were trained on a fully representative sampling from the start. In both conditions, subjects were eventually trained on the full range of instances. Elio and Anderson demonstrated that the order in which subjects received the more prototypical instances played a role in their learning of the category. In particular, they demonstrated that categories were learned more accurately in the “centered” condition; the “representative” condition yielded poorer typicality ratings and accuracy during the test phase on new instances. Elio and Anderson observe, “The superiority of the centered condition over the representative condition suggests that an initial, low-variance sample of the most frequently occurring members may allow the learner to get a ‘fix’ on what will account for most of the category members.” (p. 25) They go on to note that “A low-variance sample, in which there is a maximum amount of similarity among items, [is] particularly conducive to forming strong category generalizations.” (p. 28).7

Similar results were found by Avrahami et al. (1997) who demonstrated that subjects learned categories better when presented with several ideal positive cases followed by borderline cases than if they were presented with sequences which emphasized category boundaries from the start.8

It is important to bear in mind that we are addressing the question of how the argument structure generalization is initially learned, and not the question of how the learner knows when to extend the pattern for use with new verbs, that is, the question of productivity.

6 The study involved descriptions of people belonging to one of two clubs, with members’ descriptions varying on five 4-valued dimensions.7 Interestingly enough, Elio and Anderson (1984) also found that when subjects were explicitly asked to form hypotheses about what criteria governed category membership, the advantage of learning the centered instances first disappeared. They therefore conclude that the advantage is only evident when the learning is implicit. Implicit learning involves knowledge that is not accessible to consciousness, fairly complex and abstract, an incidental consequence of some task demand, and preserved in case of amnesia (Seger, 1994). Relevantly, language learning is an excellent example of implicit learning, since it is largely learned below the level of consciousness, is very complex and ultimately quite abstract, is a consequence of trying to communicate, and is preserved in cases of amnesia.8 Stimuli in this experiment also consisted of non-linguistic stimuli: variable sized semi-circles with variably oriented radial lines.

15

Page 16: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

There has been much discussion in the literature about productivity and we do not attempt to review it all here (Braine 1979; Baker, 1979; Bowerman, 1990; Pinker, 1989; Goldberg, 1995; Brooks & Tomasello, 1999). Three factors have been proposed in the literature as relevant to predicting a pattern’s productivity: the absolute number of distinct items that occur in a given pattern or a pattern’s type frequency (Bybee, 1985; Goldberg, 1995; MacWhinney, 1978; Plunkett & Marchman, 1991, 1993); the variability of the items that occur in a given pattern: a pattern’s degree of openness (Bybee, 1995; Janda, 1990); and preemption: the existence of a competing pattern (Brooks & Tomasello, 1999; Goldberg, 1995; Pinker, 1981).

For the present purposes, we simply wish to observe that knowing what an argument structure pattern means is another relevant factor in that it is a necessary condition for extending that pattern. That is, learners cannot readily extend a pattern without having a ‘fix’ on what the pattern means. It is suggested that the high token frequency of a single verb in a particular formal pattern should facilitate the learning of the meaning of the abstract pattern.

To summarize, we know that frequency and order of acquisition play key roles in category formation in that training on prototypical instances, frequently and/or early, facilitates category learning (Bruner, Goodnow, & Austin, 1956; Kruschke, 1996; Maddox, 1995; Nosofsky, 1988). While we have not demonstrated a causal relationship, we suggest that the semantics associated with a pattern can be learned straightforwardly because of the very frequent and early use of one verb in the pattern. For example, after using many sentences with put in the VOL construction as in (8), children come to associate the meaning of put with the construction even when the verb is not present as in (9):

(8) She put a finger on that.

(9) He done boots on. {STE, 28 months, Bates, 1988 #20}

The result is that the meaning of roughly “X causes Y to move Z loc” comes to be associated with the Subj V Obj Oblpath/loc formal pattern.

We need to emphasize that we are not claiming that general purpose verbs are necessarily the very first verbs uttered. Our cross-sectional corpus data clearly does not address this question and empirical studies have yielded differing answers to the question (see Ninio 1999 for a claim that such verbs should be the very first verbs uttered and see Campbell and Tomasello 2001 for evidence that they are not always the very first verbs).

It also must be noted that it is not a logical necessity for forming a constructional category that there should be a single verb with frequency far greater than other verbs. The correlation between form and meaning could in principle be learned by noting the association of form and meaning across several distinct verbs, each with relatively low frequency. Strikingly asymmetric frequencies are found in the data, however, with one verb having far greater frequency than other verbs for each of the constructions discussed here. Given the facts about general categorization reviewed in this section, it is reasonable to assume that the especially high frequency makes the learning of the

16

Page 17: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

correlation that much easier: the meaning associated with the construction requires minimal abstraction from the semantics of the verb as used in the construction. The highly frequent verb serves as a readily available prototype with which other verbs may be associated.

3.4.1 Accounting for the high frequencyThe question arises as to why these verbs are used more frequently in these constructions by children. One factor of course is that these verbs are among the most frequent verbs in the language at large (Carroll, 1971 #36). But this does not in itself predict that these verbs should account for the lion’s share of tokens of any single construction since most frequent verbs appear in multiple constructions.

More directly relevant is the fact that the same pattern described above in children’s speech also holds in the input: mothers’ speech to children. That is, the use of a particular construction is typically dominated by the use of that construction with one particular verb. For example, go accounts for a full 38.5% of the uses of the VL construction in the speech of mothers addressing 28-month-olds in the Bates et al. (Bates et al., 1988) corpus. This high percentage is remarkable since this construction is used with a total of 39 different verbs. Percentages of general purpose verbs in particular constructions are given below in Table 5.

Construction Mothers Total Number of Verb Types

1. Subj V Obj 38.5% go (136/353) 39 verbs

2. Subj V Obj Obl 40% put (100/250) 43 verbs

3. Subj V Obj Obj2 20.37% give (11/54) 14 verbs

Table 5. 15 Mothers’ most frequent verb and number of verb types for 3 constructions in Bates et al. (1988) corpus9

We know from previous studies that children’s use of verbs is highly sensitive to their mothers’ use of the verbs (Choi, 1999; De Villiers, 1985; Tomasello and Kruger 1992; Naigles & Hoff-Ginsberg, 1998); thus it is not surprising that the high frequencies of the children’s use reflects the high frequencies of the mothers’ use.

3.4.2 Accounting for the high frequency in the inputThe fact that go, put and give are so frequent in the input raises the question as to why that should be so. There seem to be two reasons. First, if we compare for example, go with amble, or put with shelve, it is clear that go and put are more frequent because they

9 Tell, in our small sample of 54 appeared an equal number of times as give. We believe the high number of instances of tell is an artifact of the book-reading context, since only one instance occured in a non-story context. The other 10 occurrences are all directly related to the task of reading the story; in fact the second object in 8 out of 10 instances is story.

17

Page 18: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

apply to a wider range of arguments and therefore are relevant in a wider range of contexts (Bybee, Perkins, & Pagliuca, 1992; Heine, 1993; Zipf, 1935).

In addition, each of the main uses of these verbs designates a basic pattern of experience, for example, someone causing someone to receive something (give), something causing something to move (put), or someone acting on something (do). These meanings are readily accessible to the child, since they involve concrete actions. Note that the verb be, while being generally applicable and in fact highly frequent in adult language, is learned rather late by children. This is likely because the meaning contributed by be is less easily accessible, since it is so abstract and dilute. Thus the verb meanings need to be accessible as well as highly frequent in the input in order to be frequently produced in early child language (Slobin, 1985).

3.4.3 “Pathbreaking verbs” analysisThe present proposal focuses on how the semantics associated with argument structure patterns may be learned. In a related proposal, Ninio (1999) notes that the early use of certain verbs may allow the child access to initial syntactic generalizations as well.10

Ninio analyzes early uses of SVO and VO patterns in two Hebrew-speaking populations: 15 children in a longitudinal study and 84 18-month-old children in a cross-sectional study.11 She notes that children often begin using a single verb with a direct object long before a direct object appears with other verbs; moreover, she notes the overwhelming tendency for these “pathbreaking” verbs to be drawn from the set of general purpose or light verbs. In particular, in both the longitudinal and cross-sectional studies, the children tended to use verbs meaning “want,” “make/do,” “put,” “bring,” “take out,” or “give” before other verbs were used.

In the longitudinal study, Ninio further observes that SVO and VO patterns were initially produced with only one or at most a few verbs for a prolonged period lasting between 2 and 15 weeks. More and more verbs came to be used in an exponentially increasing fashion; that is, there seemed to be more facilitation after ten verbs than after five and so on. She suggests that this increase stems from the fact that children gradually abstract a more general and purely syntactic pattern on the basis of the early verbs, and that the growing generalization allows them to use new verbs in this syntactic pattern more and more easily.

10 See Cartwright and Brent (1997) and Morris and Cottrell (1999) for suggestions of how more intricate syntactic category and grammatical relation properties may be acquired on the basis of distributional properties of the input.11 Ninio also includes data from a single English speaking child: Travis (fromTomasello, 1992); however the English data presented is highly equivocal, since one of Travis’ first two SVO expressions (Big Bird ride horsie), involved “ride,” a word unattested in any of the Hebrew speaking children’s early transcripts. Travis’ second VO combination was Find-it funny which does not clearly involve an object, since it is parsed by Ninio as a non-referential verbal clitic. Also, only one out of four of the earliest VO and SVO utterances involved the idea “obtaining” although this idea is said to account for 78.1% of the earliest Hebrew utterances. None of these discrepancies are material to the current proposal, since we do not claim that general purpose verbs are necessarily the very first verbs uttered.

18

Page 19: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

On both Ninio’s account and the present proposal, patterns are learned on the basis of generalizing over particular instances. As vocabulary increases, so does the strength of the generalization, making it progressively more and more easy to assimilate new verbs into the patterns. With Ninio, it is argued here that the instances that play a crucial early role are those involving general purpose verbs.

The two accounts complement each other in that Ninio proposes that general purpose verbs lead the way in the early acquisition of syntax, and the present proposal emphasizes the role of general purpose verbs in the acquisition of the semantics associated with basic syntactic patterns. It may well be that early uses of general purpose verbs provide the foundation for both initial syntactic and semantic generalizations, and thus provide a route to the acquisition of form and meaning correspondences: i.e., constructions.

The accounts differ in their explanations as to why general purpose verbs should be learned so early. While Ninio notes that general purpose verbs are highly frequent and pragmatically relevant, she argues that the tendency for general purpose verbs to be used early in the VO and SVO patterns stems largely from a high degree of semantic transitivity in these general purpose verbs. She states, “The ‘pathbreaking verbs’ that begin the acquisition of a novel syntactic rule tend to be generic verbs expressing the relevant combinatorial property in a relatively undiluted fashion; this is what makes them such good candidates for acquisition” (1996); these verbs are argued to necessarily express “a fundamental, unalienable, core notion of [semantic] transivity” (Ninio, 1996; see alsoNinio, 1999).

This proposal requires that the general purpose verbs that appear in the (S)VO pattern early are all highly semantically transitive, since it is this semantic property of the verbs that is argued to encourage their appearance as syntactically transitive verbs. However, as Ninio herself notes, many of the early general purpose verbs are not highly semantically transitive according to traditional criteria; i.e., they do not involve an agent acting on, or changing the state of a patient argument (see e.g., Hooper & Thompson, 1980). For example, the general purpose verbs, want, see, get and have appear among the very first verbs in Ninio’s corpus, and yet they are not highly transitive according to traditional criteria.

Ninio discusses this discrepancy at some length and argues that a notion of unmarked, “core” or “inalienable” transitivity is at issue, and not the traditional understanding of semantic transitivity which may be expressed with some type of morphological marking.12 In describing what the notion of “core” transitivity involves, she notes that many of the general purpose verbs describe situations that are more likely to result in a change in the subject argument than the patient argument (Ninio, 1996, ; 1999). In fact, Ninio observes that many early transitive utterances are “semantically speaking...in fact not very far away from intransitive sentences” (Ninio, 1996). While we clearly need to be sensitive to descrepencies between child and adult perception of events, it remains difficult to see how verbs that are acknowledged to be fundamentally similar to intransitive verbs could be actually the unmarked transitive verbs.

12 Silverstein (1976) observed that patients that are stereotypical in terms of being inanimate and causally affected are often unmarked, pace Ninio.

19

Page 20: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

In support of the claim, Ninio argues that the children’s first transitive verbs are the verbs that typically get grammaticalized as transitivity markers cross-linguistically. But while general purpose verbs do clearly often get grammaticalized as various kinds of auxiliaries (Bybee et al., 1992; Heine, 1993), their role as transitivity markers per se is far less clear. As examples of transitivity markers in Indo-European languages, the perfect marker “have” and the set of Indoaryan complex verbs that involve a general purpose verb concatenated with a noun or other host are cited. However, the “have” perfect marker, (e.g., Italian avere), applies to agentive intransitive verbs as well as to transitive verbs. And the general purpose verbs involved in complex predicates typically include a full range of general purpose verbs. For example, “become” (shodan), “come” (âmad), and “go” (raftan) all serve as general purpose verbs in complex predicate formation in Persian alongside transitive verbs (e.g.,Goldberg, 1996; Windfuhr, 1979).

Do and give are two general purpose verbs that often become grammaticalized as transitivizers or causativizers, but give is actually acquired later than many other verbs, most likely because of its more complex syntax. Other transitivizer or causativisers tend to be directionals (Song, 1996; Wolfenden, 1929) or verbs meaning “to cause” (Givón, 1971). Lacking reason to suppose that the first learned transitive general purpose verbs are necessarily semantically more transitive than other verbs, the proposal ultimately does not explain why general purpose verbs should appear with transitive syntax before other verbs.

On the present account, meanings of constructions emerge directly from generalizations over particular verbs. Since the transitive pattern appears with a range of highly frequent verbs, including verbs with low semantic transitivity such as see, get and want, the association of semantic transitivity with simple syntactic (S)VO status is predicted not to be overwhelmingly strong. In fact this is the case, as the (S)VO pattern is associated with a very wide range of meanings (e.g.,Davis, 1996; Dowty, 1991).

Another difference between the present account and Ninio (1999) is that we do not presuppose that general purpose verbs are necessarily the very first verbs used in syntactic patterns (see in fact Campbell and Tomasello 2001 for evidence that this is not always the case). Our claim concerns frequency: one general purpose verb accounts for the lion’s share of tokens in each of the constructions considered. As discussed in the previous section, the fact that general purpose verbs are used so frequently results from their high frequency in the input language and their accessible meanings. The high frequency in the input in turn stems from their generally applicable and highly relevant meanings.

The present account generalizes to other general purpose verbs that are not transitive, but are also very frequent, and can be seen to form the basis of argument structure meaning. For example, we have seen that the verb go is the most frequent verb used in the VL construction and corresponds to the meaning of that construction. The same is true of the pattern, (Subj) V Obj1 Obj2, which comes to be associated with the meaning of “give,” the pattern, (Subj) V Obj Oblpath/loc, which comes to be associated with the meaning of “put,” and so on. More generally, the specific formal patterns associated with particularly frequent verbs come to be associated with the meanings of those verbs. The alternative is to assume that each of these patterns and its associated meaning were

20

Page 21: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

known to the child at the time of the child’s first verbs. How the constructions themselves come to exist would not be explained.

3.5 The value of constructions as predictors of sentence meaning

We have argued that the high token frequency of one particular verb in a constructional pattern facilitates the learner’s getting a ‘fix’ on the intended interpretation. However, Bybee (1995) has argued that morphological tokens with especially high frequency do not lead to generalizations because they are routinized to such an extent that their internal structure is unanalyzed and therefore unavailable for analogical extension. Bybee’s argument is based on irregular morphological items such as went and am which clearly do not lend themselves to generalizations. It is quite possible that such morphological forms are used without internal analysis.

However, it is clear that the constructions under discussion represent a different case. They must be analyzed because they contain argument positions that must be filled. The VOL pattern with put, for example, has slots that are filled by different arguments from one use to the next. Psycholinguistic experiment has revealed that even frequent VP idioms such as kick the bucket are analyzed in on-line sentence comprehension (Pederson et al. 2001). In fact, VP idioms have to be analyzed since their internal structure is minimally variable: the verb may be variably marked for tense or agreement (kick the bucket; kicked the bucket; kicks the bucket).

At the same time, as emphasized above, the existence of high token frequency verbs does not entail that the pattern will necessarily be generalized away from that one particular verb. In the rest of the paper we demonstrate that the generalization away from a particular verb to a more general pattern is useful in predicting overall sentence meaning, more useful in fact than knowledge of individual verbs. As observed early on in section 2.1, it is prediction that drives generalization. Thus we claim that it is the predictive value that encourages speakers to generalize away from knowledge of specific verbs to ultimately learn the semantic side of linking generalizations, or constructional meaning. Specifically, while the verb together with its argument structure provide an extremely reliable interpretation as discussed in section 2.1, we will see below that the construction is a better predictor of overall sentence meaning than the morphological form of the verb. Thus there is reason to expect to find an increased reliance on the constructions as a cue.

Precedent for this idea again comes from work in the non-linguistic categorization literature. Kruschke (1996) and Dennis and Kruschke (1998) discuss how learners shift attention away from non-distinctive cues toward distinctive cues, when learning overlapping instances that belong to distinct categories. For example if two diseases share one symptom but have their own distinctive symptom, subjects will attend more to the distinctive symptoms than the shared one.13

13 Kruschke discusses how this tendency is so strong that it can account for the neglect of base rate information (the inverse base rate effect discovered by Medin and Edelson (1988)). In Medin and Edelson’s study, subjects were taught to classify diseases on the basis of symptoms. Subjects were trained to discriminate disease C (for common) and R (for rare), with C being presented three times as often as R. In the training, every instance of C had two symptoms, I and PC, and every instance of R had two

21

Page 22: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

It is clear that constructions are at least sometimes better predictors of overall meaning than many verbs. For example, when get appears in the VOL construction, it conveys caused motion, but when it appears in the VOO construction, it conveys transfer, and when it appears in the VL construction, it conveys intransitive motion:

(10) Pat got the ball over the fence.

get + VOL pattern “caused motion”

Pat got Bob a cake.

get + VOO pattern “transfer”

Pat got into the car.

get + VL pattern “intransitive motion”

Since most verbs appear in more than one construction with corresponding differences in interpretation, speakers would do well to learn to attend to the constructions. As an indication of the fact that the construction is at least as good a predictor of overall sentence meaning as the verb, we consider again the corpus use of the VOL construction.

3.5.1 Corpus evidence of the construction as a reliable predictor of overall sentence meaning

We reviewed all instances of the VOL construction in the database of mothers’ speech to 28 month old children in the Bates et al. (1988) corpus discussed above. Two independent coders classified utterances as either entailing caused motion or not; those that we judged not to entail caused motion were separated and further analyzed as discussed below. Disagreements were resolved through discussion. Agreement between the two independent coders was 99% initially and 100% after discussion (n = 250) for classifying mothers’ VOL utterances as entailing caused motion or not.

We found that 70% (176/250) of the mother’s instances of the construction clearly entail literal caused motion. The following examples are representative:

(11) get some more in it

bring ‘em back over here

stuff that all in your mouth

symptoms, I and PR. Since I appeared with both diseases, it was an imperfect predictor; PC was a perfect predictor of the common disease, PR was a perfect predictor of the rare disease.When tested with the ambiguous symptom I, subjects acted in accord with the disease’s overall frequency and chose the common disease, C. But when presented with conflicting symptoms PC and PR, people tended to choose the rare disease, contrary to what would be expected if base rates were taken into account in a simple way. Krushke (1996) suggests that subjects learn that the shared symptom I tends to be misleading for the less frequent disease

22

Page 23: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

put ‘em in the box

park the car over there

pick it up here

take ‘em in the house

Another 8% of instances involve the verbs keep, have, get or leave as in the following type of examples:

(12) keeping these people in the garage

does he have something on his head?

what’s she got on?

leave it right there

The utterances in (12) entail that the subject argument acts to keep or allow the theme argument to stay in a particular location. The subject argument is agentive and the locative phrase is predicated of the direct object argument just as in instances that entail prototypical caused motion. Many researchers have related these instances to cases of caused motion independently (e.g., Goldberg, 1995; Matsumoto, 1992; Talmy, 1976). If we include these cases in the final tally, 78% of VOL utterances imply caused motion or caused location. Another 5% of instances involve the verbs read or say which could be argued to encode metaphorical caused motion (Ackerman & Webelhuth, 1988; Goldberg, 1995; Reddy, 1979). Including these cases would raise the total number of VOL utterances whose meanings are related to caused motion to 83%.

Of the remaining VOL utterances, 20, or 8% involved locative adjuncts. We included these as instances of VOLs because we did not want to assume that the children had mastered the distinction between arguments and adjuncts. If we had excluded them, the total number of utterances included as VOL utterances would have been reduced from 250 to 230. The total percentage of VOL utterances that involved caused motion would have been 90%.

“Cue validity” is the conditional probability that an object is in a particular category, given that it has a particular feature or cue (Murphy, 1982). P(A|B) is the probablity of A, given B. The cue validity of VOL as a predictor of “caused-motion” meaning, or P(“caused-motion” | VOL), is somewhere between .70 and .90, depending on how inclusive we take the notion of “caused motion” to be, and how inclusively we define the VOL formal pattern.

The remaining 21 tokens include the following types of examples which do not entail literal motion:

(13) What is your foot doing on the table? (The WXDY construction: 17 instances)

23

Page 24: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

What did Ivy do to her arm? (1 instance)

find the bird in the snow (utterances with find: 2 instances)

get Papa at the airport (move-from interpretation: 1 instance)

Turning our attention to the verbs that occur in the VOL construction, in mothers’ speech in the Bates et al. (1988) corpus, as described in section 3.3.1, 40% of instances involved the verb put which is an excellent predictor of the overall sentence’s meaning. However, the next most frequent verbs in the VOL construction were do, accounting for 8.4% of the data, have (5.6%) and take (5.6%), and get (4.8%). These verbs are not reliably associated with the meaning they have in the VOL pattern insofar as each of these verbs occurs more frequently in a different construction, with a different meaning.

The verb do appears 21 times in the VOL construction. The majority of these cases (17) are instances of the special “What’s X doing Y?” (WXDY) construction. As described by Kay and Fillmore (1999), this construction is used to entail that there is something noncanonical about a situation. If we ask what the cue validity is of do as cue for this construction, we will see that it is extremely low. Do appears 551 times in the following constructions in the corpus:

21 VOL (13 in WXDY construction; 9 transitives with a locative adjunct)

5 intransitive (“act” interpretation: semantically intransitive)

159 transitive (“act on” interpretation: semantically transitive)

366 auxiliary (aspectual, polarity or grammatical meaning)

The probablity of the WXDY construction, given do, or the relevant cue validity is P(WXDY | do) = 17 / 551 = .003. Thus do is, in and of itself, not a reliable predictor of overall sentence meaning when it appears in the VOL frame.

While get appears 12 times in the VOL construction with a caused-motion interpretation (e.g., She got the toy into the box), the same morphological form occurs in a variety of different constructions with different meanings:

53 simple transitives (possession: get a washcloth)

15 resultative (change of state: get tired)

12 VOL construction (caused motion: get those over here)

9 transitive resultative (caused change state: get them undone)

8 VL construction (intrans. motion: get out of the car.)

7 VOO (transfer: get me some of those)

4 VP complement (modality: you got to chew your cookie up)

24

Page 25: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

1 NP VP complement (causation: get to take a bath)

Table 6. Occurrences of get in various constructions in the Bates et al. (1988) corpus

The cue validity of get as an indicator of caused motion is P (“caused motion” | get) = 12/109 = .11.

Take appears 14 times in the VOL construction with a meaning of caused motion, e.g., take their chairs in the house}. However, it occurs 30 times in the simple transitive construction. Twenty six of these transitive uses did not imply motion, but instead involved simple action scenes such as take a bite/a nap/some pictures; the other four did imply caused motion.

Have appears 14 times in the VOL construction, including 13 instances that entail location as in have something on his head. One instance of have in the VOL pattern was have a good time at the store. Across the speech of 15 mothers, have occurred 30 times with a VP complement conveying deontic modality as in now you have to wipe your mouth, 7 times as a perfect auxiliary, and 6 times in miscellaneous other patterns. The vast majority of instances of have (61) appeared in the simple transitive construction, conveying possession (can I have one of your apples) or action (you just had a nap). The cue validity of have as an indicator of caused location is 13/118, or .11.

Thus knowing only that do, get, take or have is present is not in itself a highly reliable predictor of the overall sentence meaning. The verbs put, do, get, take and have accounted collectively for 64% of the VOL tokens. To estimate the overall cue validity of verbs in the VOL construction, let us make the generous, if implausible, assumption that every single one of the remaining 36% of tokens involved a verb that was a perfect predictor of overall sentence meaning (cue validity = 1). We can then calculate the cue validity for verbs as the weighted average of the cue validity of each verb or:

.40 (1) + .084 (.003) + .056 (.11) + .056 (.41) + .048 (.11) + .36 (1) = .79

.40 + .0001 + .006 + .023 + .005 + .36 = .79(put) (do) (have) (take) (get) the rest

Given our dubious assumption that each of the unanalyzed remaining verbs had cue validity of 1, it is clear the cue validity of .79 for the verb predicting overall sentence meaning is an upper bound. If we compare the .70—.90 cue validity for the VOL pattern as a predictor of caused motion meaning, we can see the construction looks to be roughly as reliable or valid a cue. To better compare the verbs and constructions as predictors of overall sentence meaning, it is worth considering an experiment designed to test just this.

3.5.2 Experimental evidence of construction as reliable predictor of overall sentence meaning

Bencini & Goldberg (2000) conducted an experiment inspired by the Healy and Miller (1970) sorting experiment described above, which had been titled, “The Verb as the Main Determinant of Sentence Meaning.” In this earlier experiment, stimuli were created by crossing subject arguments with verbs, since it was assumed that the two best candidates

25

Page 26: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

for determining what the sentence was about were the verb and the subject argument. We aimed to compare the semantic contribution of the construction with that of the morphological form of the verb. The stimuli were sixteen sentences created by crossing four verbs with four different constructions:

1a. Pat threw the hammer (VO) Transitiveb. Chris threw Linda the pencil (VOO) Ditransitivec. Pat threw the key onto the roof. (VOL) Caused Motiond. Lyn threw the box apart. (VOR) Resultative

2a. Michelle got the book. (VO) Transitiveb. Beth got Liz an invitation (VOO) Ditransitivec. Laura got the ball into the net. (VOL) Caused Motiond. Dana got the mattress inflated. (VOR) Resultative

3a. Barbara sliced the bread. (VO) Transitiveb. Jennifer sliced Terry an apple (VOO) Ditransitivec. Meg sliced the ham onto the plate (VOL) Caused Motiond. Nancy sliced the tire open (VOR) Resultative

4a. Audrey took the watch. (VO) Transitiveb. Paula took Sue a message. (VOO) Ditransitivec. Kim took the rose into the house (VOL) Caused Motiond. Rachel took the wall down. (VOR) Resultative

Table 7. Stimuli for Sorting Experiment

Seventeen University of Illinois undergraduate students were asked to sort these sixteen sentences, provided in random order, into four piles based on “overall sentence meaning.” Subjects were instructed that there was no right or wrong answer, that the experiment was only intended to determine how people sorted sentences according to sentence meaning. Subjects could sort equally well by verb: e.g., all instances of throw (1a-d) being put into the same pile, regardless of construction; or subjects could sort by construction: all instances of e.g., the VOO or ditransitive construction (1a, 2a, 3a, and 4a) being put into the same pile.

It would of course be possible to design stimuli with a great deal of overlapping propositional content such that we could a priori predict either a verb or constructional sort. For example, the sentences Pat shot the duck and Pat shot the duck dead would very likely be grouped together on the basis of overall meaning despite the fact that the argument structure patterns are distinct. Conversely, Pat shot the elephant and Patricia stabbed a pachyderm would likely be grouped together despite the fact that no exact words were shared. The stimuli were designed to minimize such contentful overlap contributed by anything other than the lexical verb. No other lexical items in the stimuli were identical or near synonyms.

26

Page 27: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

The use of the sorting paradigm is a particularly stringent test to demonstrate the role of constructions. Medin et al. (1987) has shown that there is a strong, domain-independent bias towards sorting on the basis of a single dimension, even with categories that are designed to resist such one-dimensional sorts in favor of a sort based on a family resemblance structure (Rosch & Mervis, 1975). One-dimensional sorting has been found even with large numbers of dimensions (Smith, 1981), ternary values on each dimension (Anh & Medin, 1992), holistic stimuli and stimuli for which an obvious multidimensional descriptor was available (Regehr & Brooks, 1995). The stimuli presented subjects with an opportunity to sort according to a single dimension: the verb. Constructional sorts required subjects to note an abstract relational similarity that required the recognition that several grammatical functions co-occur. Thus we would expect verb sorts to have an inherent advantage over constructional sorts.

Six subjects produced entirely construction sorts, seven subjects produced entirely verb sorts, and four subjects provided mixed sorts. In order to include the mixed sorts in the analysis, results were analyzed according to how many changes would be required from the subject’s sort to either a sort entirely by verb (VS) or a sort entirely by construction (CS). The average number of changes required for the sort to be entirely by the verb was 5.5; the average number of changes required for the sort to be entirely by construction was 5.7. The difference between these scores does not approach significance. That is, subjects were just as likely to sort by construction as they were to sort according to the single dimension of the morphological form of the verb. If verbs provided equally good cues to overall sentence meaning, there would be no motivation to overcome the well-documented preference for one-dimensional sorts: subjects would have no motivation to sort by construction instead of by verb. Bencini and Goldberg hypothesize that constructional sorts were able to overcome the one-dimensional sorting bias to this extent because constructions are in fact better predictors of overall sentence meaning than the morphological form of the verb.

Bencini and Goldberg considered the possibility that our choice of verbs unduly influenced subjects’ sorts. It might be argued, for example, that while the verbs get and take may not contribute to overall sentence meaning as much as constructions, the semantically richer verbs slice and throw would. If this were so, we would expect to find that the latter verbs were put into the same piles in the sorts more often than the former verbs. However, an analysis of the number of separate piles that each verb was put into by the participants did not reveal any difference among the verbs either in the overall test (F < 1) nor in any of the single degree of freedom tests (all Fs < 1).

This experiment was performed with adults, but the implications for language learning are clear. Insofar as constructions are at least as good predictors of overall sentence meaning than any other word in the sentence, learners would do well to learn to identify construction types, since their goal is to understand sentences.

A question arises as to why constructions should perform at least as good as predictors of overall sentence meaning as verbs. The answer takes us back to the Gricean Maxim of Relevance and the fact that in context, knowing the number and type of arguments conveys a great deal about the scene being conveyed. To the extent that verbs encode rich semantic frames that can be related to a number of different basic scenes (Goldberg,

27

Page 28: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

1995), the complement configuration or construction will be as good a predictor of sentence meaning as the semantically richer, but more flexible verb.

3.5.3 Category ValidityWe have discussed cue validity, the probablity that an item belongs to a category, given that it has a particular feature: P(cat | feature), and we have found that when the category is taken to be overall sentence meaning, constructions have roughly equivalent cue validity compared with verbs. There is also a second relevant factor. Category validity is the probability that an item has a feature, given that the item belongs in the category: P(feature | cat). Thus category validity measures how common a feature is among members of a category. The relevant category is again, sentence meaning.

Given that there are hundreds of verbs that can be used to convey caused motion, but there are only one or two argument structure constructions that are used to convey caused motion (namely the VOL pattern and the simple transitive construction as in He brought his lunch), the constructions clearly have higher category validity than the verbs. That is, the category validity of the VOL pattern as a feature of the semantic category caused motion, is the probability that the VOL pattern will be involved, given the interpretation of caused-motion. Since expressions which convey caused motion are generally only of two types, VO and VOL, the category validity of VOL for the category of caused motion is relatively high.

On the other hand, the category validity of particular verbs as a feature of the semantic category caused motion can be calculated by taking the average of the sum of the probability of each verb that may be associated with caused motion being the actual verb involved when caused motion is expressed. Since so many different verbs can entail the notion of caused motion, this average is necessarily quite low. For example, what is the probability that a sentence with caused motion meaning will contain the verb throw? Clearly only a tiny fraction of expressions of caused motion contain this particular verb. Thus throw has low category validity for the category of caused motion. Since this is true for all verbs that may convey caused motion, it is also true of the average of all verbs that may convey caused motion. In this way the construction is available as a cue much more often than any individual verbs. This idea is expressed in Table 9:

P (VOL | “caused motion”) HighHigh

vs. n∑ P (verbi | “caused motion”)/n LowLowi=1 where verbi is a verb that may encode caused motion

Table 9. Category validity formuli for both form-based pattern and verb as features of the category of caused motion

All things being equal, if two cues have roughly equal validity, the higher category validity of one cue will naturally result in a greater reliance on that cue in categorization tasks (Bates & MacWhinney, 1987; Estes, 1986; Hintzman, 1986; Nosofsky, 1988).

28

Page 29: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Thus constructions are better cues to sentence meaning than verbs insofar as they are as reliable (with equivalent cue validty) and much more available (having higher category validity).

4 ConclusionThis work suggests an account of both how and why children form the argument structure generalizations they do. The paper’s main theoretical contribution is to explicitly relate generally recognized facts about language acquisition to generally recognized facts about categorization. Many researchers in language acquisition allude to general categorization processes, but very little in the way of specifics have been offered. We argue that children initially generalize at the level of specific verbs plus argument slots (Tomasello’s “verb islands”) because the verb in an argument frame is an excellent predictor of overall sentence meaning. We argue further that children eventually generalize away from specific verbs to form more abstract argument structure constructions because the argument frame or construction has roughly equivalent cue validity as a predictor of overall sentence meaning than the morphological form of the verb, and has much greater category validity. That is, initially children attend to the complex of cues: that of verb and construction; eventually, attention focuses on the construction because the construction has equivalent cue validity and much greater category validity as compared with individual verbs, given the task of sentence comprehension. The construction is at least as reliable and much more available.

We further argue that the input is structured in such a way as to make the generalization from verb islands to argument structure constructions straightforward. The paper’s main empirical contribution is the finding that one particular verb accounts for the lion’s share of tokens of each argument frame considered in an extensive corpus study on the Bates et al. (1988) corpus, in both mothers’ and 28 month old children’s speech. The dominance of a single verb in the construction facilitates the association of the meaning of the verb in the construction with the construction itself, allowing learners to get a “fix” on the construction’s meaning. In this way, grammatical constructions may arise developmentally as generalizations over lexical items in particular patterns.

These proposals for how the semantics associated with constructions (linking rules) is learnable from the input directly undermines the “paucity of the stimulus” argument as it is aimed at this particular problem. Before we decide that language-specific properties must be innate, it is worth investigating how they might be learned, given general cognitive mechanisms such as categorization, together with a closer look at the input children receive.

29

Page 30: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

References

Ackerman, F., & Webelhuth, G. (1988). A theory of predicates: CSLI Publications.Akhtar, N. (1999). Acquiring word order: Evidence for data-driven learning of syntactic

structure. Journal of Child Language, 26(2), 339-356.Akhtar, N., & Tomasello, M. (1997). Young children's productivity with word order and

verb morphology. Developmental Psychology, 33(6), 952-965.Anderson, J. R. (1991). The adaptive nature of human categorization. Psychological

Review, 98(3), 409-429.Anh, W.-K., & Medin, D. L. (1992). A two-stage model of category construction.

Cognitive Science, 16(1), 81-121.Aronoff. (1976). Word Formation in Generative Grammar. Cambridge, Mass: MIT

Press.Avrahami, J., Kareev, Y., Bogot, Y., Caspi, R., Dunaevsky, S., & Lerner, S. (1997).

Teaching by examples: Implications for the process of category acquisition. The Quarterly Journal of Experimental Psychology, 50A(3), 586-606.

Baker, C. L. (1979). Syntactic theory and the projection problem. Linguistic Inquiry, 10(4), 533-581.

Bates, E., Bretherton, I., & Snyder, L. (1988). From First Words to Grammar: Individual Differences and Dissociable Mechanisms. New York: Cambridge University Press.

Bates, E., & Goodman, J. (1998). Grammar form the lexicon. In B. MacWhinney (Ed.), The Emergence of Language (pp. 197-212). Hillsdale, N. J.: Lawrence Erlbaum Associates.

Bates, E., & MacWhinney, B. (1987). Competition, variation, and language learning. In B. MacWhinney (Ed.), Mechanisms of Language Acquisition (pp. 157-193). Hillsdale, N. J.: Lawrence Erlbaum Associates.

Bencini, G., & Goldberg, A. (2000). The contribution of argument structure constructions to sentence meaning. Journal of Memory and Language, 43(4), 640-651.

Bowerman, M. (1973). Early Syntactic Development: A Cross-linguistic Study with Special Reference to Finnish. Cambridge: Cambridge University Press.

Bowerman, M. (1982). Reorganizational processes in lexical and syntactic development. In E. Wanner & L. R. Gleitman (Eds.), Language Acquisition: The State of the Art (pp. 319-346). New York: Cambridge University Press.

Bowerman, M. (1990). Mapping Thematic Roles onto Syntactic Functions: Are Children Helped by Innate Linking Rules? Linguistics, 28(6 (310), 1253-1289.

Braine, M.D.S. (1971). On two types o fmodels of the internalization of grammars. In Slobin, D.I. (ed.), The ontogenesis of grammar. New York: Academic Press.

Braine, M. D. S. (1976). Children's First Word Combinations (Vol. 4 (1)).Bresnan, J. (1982). The Mental Representation of Grammatical Relations. Cambridge,

Mass: MIT Press.

30

Page 31: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Bresnan, J., & Kanerva, J. (1989). Locative inversion in Chichewa: A case study in the factorization of grammar. Linguistic Inquiry, 20(????), 1-50.

Brooks, P., & Tomasello, M. (1999). How children constrain their argument structure constructions. Language, 75(4), 720-738.

Brown, R. (1973). A First Language: The Early Stages. Cambridge, Mass: Harvard University Press.

Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A Study of Thinking. New York: Wiley.

Bybee, J. (1995). Regular Morphology and the Lexicon. Language and Cognitive Processes, 10(5), 425-455.

Bybee, J., Perkins, R., & Pagliuca, W. (1992). The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. Chicago: University of Chicago Press.

Bybee, J. L. (1985). Morphology: A Study of the Relation between Meaning and Form: John Benjamins Publishing Company.

Campbell, A. and M. Tomasello. (2001). The acquisition of English dative constructions. _Applied Linguistics_, 22, 253-267.

Carey, S., & Bartlett, E. (1978). Acquiring a single new word.Unpublished manuscript, Stanford: CA.

Cartwright, T. A., & Brent, M. R. (1997). Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition, 63(2), 121-170.

Choi, S. (1999). Early Development of Verb Structures and caregiver Input in Korean: Two case studies. 3(2-3), 241-265.

Chomsky, N. (1965). Aspects and the Theory of Syntax. Cambridge: MIT Press.Chomsky, N. (1980). The linguistic approach. In M. Piatelli-Palmarini (Ed.), Language

and Learning (pp. 109-130). Cambridge, Mass: Harvard University Press.Chomsky, N. (1988). Language and Problems of Knowledge: The Managua Lectures.

Cambridge, Mass: MIT Press.Clark, E. V. (1978). Discovering what words can do. Paper presented at the Papers from

the parasession on the lexicon.Clark, E. V. (1996). Early verbs, event types and inflections (Vol. 9). Mahwah, N. J.:

Lawrence Erlbaum Associates.Clark, E. V., & Clark, H. H. (1979). When nouns surface as verbs. Language, 55(4), 767-

811.Culicover, P. W. (1999). Syntactic Nuts: hard Cases, Syntactic Theory and Language

Acquisition reviewed by John R Taylor. Cognitive Linguistics, 10(3), 251-261.Davis, A. R. (1996). Lexical Semantics and Linking in the Hierarchical Lexicon.

Unpublished Ph.D Dissertation, Stanford University, Stanford: CA.De Villiers, J. G. (1985). Learning how to use verbs: Lexical coding and the influence of

the input. Journal of Child Language, 12(3), 587-595.Deacon, T. W. (1997). The Symbolic Species: The Co-Evolution of Language and the

Brain.Dennis, S., & Kruschke, J. K. (1998). Shifting attention in cued recall. Australian Journal

of Psychology, 50(3), 131-138.

31

Page 32: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Dowty, D. (1991). Thematic proto-roles and argument selection. Language, 67(3), 547-619.

Elio, R., & Anderson, J. R. (1984). The effects of information order and learning mode on schema abstraction. Memory and Cognition, 12(1), 20-30.

Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, Mass: MIT Press.

Estes, W. K. (1986). Array models for category learning. Cognitive Psychology, 18, 500-548.

Fillmore, C. J., Kay, P., Kathol, A., & Michaelis, L. (Unpublished Manuscript). Construction Grammar.Unpublished manuscript, University of California, Berkeley.

Fisher, C. (1996). Structural limits on verb mapping: The role of analogy in children's interpretations of sentences. Cognitive Psychology, 31(1), 41-81.

Fisher, C. (2000). From form to meaning: A role for structural alignment in the acquisition of language. Adv Child Developmental Behavior, 27, 1-53.

Fisher, C., Gleitman, H., & Gleitman, L. R. (1991). On the semantic content of subcategorization frames. Cognitive Psychology, 23(3), 331-392.

Fisher, C. (2000). The role of abstract syntactic knowledge in language acquisition: a reply to Tomsello (2000). Cognition 82: 259-278.

Ford, M. B. J. K. R. M. (1982). A Competence-Based Theory of Syntactic Closure. MIT Press. Cambridge.

Garnsey, S. M., Pearlmutter, N. J., Myers, E., & Lotocky, M. (1997). The contributions of verb bias and plausibility to the comprehension of temporarily ambiguous sentences. Journal of Memory and Language, 37(1), 58-93.

Gentner, D. (1982). Why nouns are learned before verbs: Linguistic relativity versus natural partitioning. In S. A. Kuzcaj II (Ed.), Language Development, Vol 2: Language, Thought, and Culture (Vol. 2, pp. 301-334). Hillsdale, N. J.: Lawrence Erlbaum Associates.

Gentner, D., & Medina, J. (1998). Similarity and the development of rules. Cognition, 65(2-3), 263-297.

Gillette, J., Gleitman, H., Gleitman, L. R., & Lederer, A. (1998). Human simulations of vocabulary learning: University of Pennsylvania.

Givón, T. (1971). Historical syntax and synchronic morphology: An archaeologist's field trip. Chicago Linguistic Society, 7, 384-415.

Gleitman, L. (1994). The Structural Sources of Verb Meanings. In P. Bloom (Ed.), Language Acquisition: core readings: MIT Press.

Goldberg, A. E. (1992). The inherent semantics of argument structure: The case of the English ditransitive construction. Cognitive Linguistics, 3(1), 37-74.

Goldberg, A. E. (1995). Constructions: A Construction Grammar approach to argument structure. Chicago: Chicago University Press.

Goldberg, A. E. (1996). Optimizing constraints and the Persian complex predicate. Berkely Linguistic Society, ????

Goldberg, A. E. (1998). The emergence of the semantics of argument structure constructions. In B. MacWhinney (Ed.), The Emergence of Language (pp. 197-212). Hillsdale, N. J.: Lawrence Erlbaum Associates.

32

Page 33: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Grégoire, A. (1937). L'apprentissage du langage (Vol. I). Paris: Droz.Grice, H. P. (1975). Logic and Conversation. In P. Cole & J. Morgan (Eds.), Syntax and

Semantics 3: Speech Acts (pp. 41-58). New York: Academic Press.Grimshaw, J. (1981). Form, function and the language acquisition device. In C. L. Baker

& J. J. McCarthy (Eds.), The Logical Problem of Language Acquisition. Cambridge: MIT Press.

Grimshaw, J. (1990). Argument Structure. Cambridge, Mass: MIT Press.Gropen, J., Epstein, T., & Schumacher, L. (1997). Context-sensitive verb learning:

Children's ability to associate contextual information with the argument of a verb. Cognitive Linguistics, 8(2), 137-182.

Gropen, J., Pinker, S., Hollander, M., Goldberg, R., & Wilson, R. (1989). The learnability and acquisition of the dative alternation in English. Language, 65(2), 203-257.

Gross, M. (1975). Language.Healy, A. F., & Miller, G. A. (1970). The verb as the main determinant of sentence

meaning. Psychonomic Science, 20, 372.Heine, B. (1993). Auxiliaries: Cognitive forces and grammaticalization. New York:

Oxford University Press.Hintzman, D. L. (1986). "Schema abstraction" in a multiple-trace memory model.

Psychological Review, Vol 93(4), Oct 1986, 1411-1428 American Psychological Assn , US.

Hirsh-Pasek, K., Golinkoff, R. M., & Naigles, L. (1996). Young Children's Use of Syntactic Frames to Derive Meaning. In K. Hirsh-Pasek & R. M. Golinkoff (Eds.), The Origins of Grammar: Evidence from Early Language Comprehension (Vol. MIT Press). Cambridge, MA.

Holland, J. H., Holyoak, K. J., & Thagard, R. E. R. (1989). Induction: Processes of Inference, Learning and Discovery. Cambridge, Mass: MIT Press.

Homa, D., Dunbar, S., & Nohre, L. (1991). Instance frequency, categorization, and the modulating effect of experience. Journal of Experimental Psychology: Learning, Memory, & Cognition, Vol 17(3), May 1991, 1444-1458 American Psychological Assn, US, http://www apa org.

Hooper, P. J., & Thompson, S. A. (1980). Transitivity in Grammar and Discourse. Language, 56(2), 251-299.

Horn, L. R. (1984). Towards a new taxonomy for pragmatic inference: Q-based and R-based implicature. In D. Schiffrin (Ed.), GURT '84: Meaning form and use in context. Washington: Georgetown University Press.

Hovav, & Levin. (1998). Building Verb Meanings. In M. Butt & W. Geuder (Eds.), The projection of Arguments: Lexical and Compositional Factors. Stanford: CA: CSLI Publications.

Ingram, D., & Thompson, W. (1996). Early syntactic acquisition in German: Evidence for the Modal Hypothesis. Language, 72(1), 97-120.

Jackendoff, R. (1983). Semantics and Cognition. Cambridge, Mass: MIT Press.Jackendoff, R. (to appear).Janda, R. D. (1990). Frequency, markedness and morphological change: On predicting

the spread of noun-plural -s in Modern High German and West Germanic. ESCOL, 7, 136-153.

33

Page 34: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Kay, P., & Fillmore, C. J. (1999). Grammatical constructions and linguistic generalizations: The What's X doing Y? construction. Language, 75(1), 1-34.

Kersten, A. W., & Billman, D. (1997). Event category learning. Journal of Experimental Psychology: Learning, Memory and Cognition, 23(3), 638-658.

Kruschke, J. K. (1996). Base rates in category learning. Journal of Experimental Psychology: Learning, Memory and Cognition, 22(1), 3-26.

Kuno, S. (1972). Functional sentence perspective: A case study from Japanese and English. Linguistic Inquiry, 3(3), 269-320.

Lambrecht, K. (1994). Information Structure and Sentence Form: Cambridge University Press.

Landau, B., & Gleitman, L. R. (1985). Language and Experience: Evidence from a Blind Child. Cambridge: Mass: Harvard University Press.

Leake, D. B., & Ram, A. (1995). Learning, goals, and learning goals: A perspective on goal-driven learning. Artificial Intelligence Review, 9(6), 387-422.

Levin, B. (1993). English verb Classes and Alternations. Chicago: University of Chicago Press.

Lieven, E. V. M., Pine, J. M., & Baldwin, G. (1997). Lexically-based learning and early grammatical development. Journal of Child Language, 24(1), 187-219.

MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1993). The Lexical Nature of Syntactic Ambiguity Resolution.Unpublished manuscript, USC.

MacWhinney, B. (1978). The Acquisition of Morphophonology (Vol. 43). Chicago: University of Chicago Press.

MacWhinney, B. (1982). Basic syntactic processes. In S. A. K. II (Ed.), Language Development (Vol. Syntax and Semantics Vol 1). New Jersey: Lawrence Erlbaum Associates.

MacWhinney, B. (1995). The CHILDES Project: Tools for analyzing talk (Second ed.). Hillsdale, N. J.: Lawrence Erlbaum Associates.

Maddox, W. T. (1995). Base-Rate Effects in Multidimensional Perceptual Categorization. Journal of Experimental Psychology: Learning, Memory and Cognition, 21(2), 288-301.

Markman, A. B., & Gentner, D. (1993). Structural alignment during similarity comparisons. Cognitive Psychology, 25(4), 431-467.

Matsumoto, Y. (1992). Abstract Motion and English and Japanese Verbs.Unpublished manuscript, Tokyo Christian University.

Medin, D. L., & Edelson, S. M. (1988). Problem structure and the use of base-rate information from experience. Journal of Experimental Psychology: General, 117(1), 68-85.

Medin, D. L., Wattenmaker, W. D., & Hampson, S. E. (1987). Family resemblance conceptual cohesiveness, and category construction. Cognitive Psychology, 12(2), 242-279.

Morris, W. C., & Cottrell, G., W. (1999). The empirical acquisition of grammatical relations. Paper presented at the Proceedings of the Cognitive Science Society Meeting.

Murphy, G. L. (1982). Cue validity and levels of categoryization. Psychological Bulletin, 91(1), 174-177.

34

Page 35: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Naigles, L. (1990). Children use syntax to learn verb meanings. Journal of Child Language, 357-374.

Naigles, L. R., & Hoff-Ginsberg, E. (1998). Why are some verbs learned before other verbs? Effects of input frequency and structure on children's early verb use. JCL, 25, 95-120.

Newmeyer, F. (1998). Language Form and Language Function. Cambridge: Mass: MIT Press.

Ninio, A. (1996, July 1996). Pathbreaking verbs in semantic development. Paper presented at the presented at the 7th Internatioanl Congress for Child Language, Istanbul, Turkey.

Ninio, A. (1999). Pathbreaking verbs in syntactic development and the question of prototypical transitivity. JCL, 26(619-653).

Nosofsky, R. (1988). Similarity, frequency and category representations. Journal of Experimental Psychology: Learning, Memory and Cognition, 14(1), 54-65.

Olguin, R., & Tomasello, M. (1993). Twenty-five-month-old children do not have a grammatical category of verb. Cognitive Development, 8(3), 245-272.

Park, T.-Z. (1977). Emerging language in Korean children.Unpublished manuscript, Bern: Institute of Psychology.

Peterson, R.R., Burgess, C., Dell, G.S., & Eberhard, K.M. (2001) Dissociation between syntactic and semantic processing during idiom comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 1223-1237.

Pinker, S. (1981). Comments on the paper by Wexler. In C. L. Baker & J. J. McCarthy (Eds.), The Logical Problem of Language Acquisition. Cambridge: Mass: Harvard University Press.

Pinker, S. (1984). Language Learnability and Language Development: Harvard University Press.

Pinker, S. (1989). Learnability and Cognition: The Acquisition of Argument Structure. Cambridge, Mass: MIT Press/Bradford Books.

Pinker, S. (1994). The Language Instinct. New York: William Morrow and Co.Plunkett, K., & Marchman, V. (1991). U-Shaped learning and frequency-effects in a

multilayerd perceptron: Implications for child language acquisition. Cognition, 38(1), 43-102.

Plunkett, K., & Marchman, V. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition, 48(1), 21-69.

Pollard, C., & Sag, I. (1987). Information-based syntax and semantics. Stanford, CA: CSLI Publications.

Posner, M. I., Goldsmith, R., & Welton, K. E. J. (1967). Perceived distance and the classification of distorted patterns. Journal of Experimental Psychology: General, 73(1), 28-38.

Posner, M. I., & Keele, S. W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77(3.1), 353-363.

Reddy, M. (1979). The conduit metaphor. In A. Ortony (Ed.), Metaphor and thought (pp. 284-324). Cambridge: Cambridge University Press.

Regehr, G., & Brooks, L. R. (1995). Category organization in free classification: The organizing effect of an array of stimuli. Journal of Experimental Psychology: Learning, Memory and Cognition, 21(2), 347-363.

35

Page 36: Learning Argument Structure Generalizationsadele/papers/Papers/C.doc · Web viewAt the same time, when subjects were also provided a written list of co-occurring nouns (in alphabetical

Reinhart, T. (1982). Pragmatics and linguistics: An analysis of sentence topics. Bloomington, Indiana: Indiana University Linguistics Club.

Rosch, E., & Mervis, C. B. (1975). Family Resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7(573-605).

Ross, B. H., & Makin, V. S. (1999). Prototype versus exemplar models, The Concept of Cognition. Cambridge, Mass: MIT Press.

Sampson, G. (1997). Educating Eve: The 'Language Instinct' Debate. New York: Cassell.Sanchés, M. (1978). On the Emergence of Multi-Element Utterances in the Child's

Japanese.Unpublished manuscript, Austin, Texas.Schlesinger, I. M. (1982). Steps to Language: Toward a Theory of Language Acquisition.

Hillsdale, N. J.: Lawrence Erlbaum Associates.Seger, C. A. (1994). Implicit learning. Psychological Bulletin, 115(2), 163-196.Sethuraman, N., Goldberg, A. E., & Goodman, J. (1997). Using the semantics associated

with syntactic frames for interpretation without the aid of non-linguistic context. Paper presented at the Twenty-seventh Annual Child Language Research Forum.

Silverstein, M. (1976). Hierarchy of features and ergativity. In R. Dixon (Ed.), Grammatical categories in Australian languages. New York: Humanities Press.

Slobin, D. I. (1985). Crosslinguistic evidence for the language-making capacity. In D. I. Slobin (Ed.), A Crosslinguistic Study of Language Acquisition (Vol. 2: Theoretical Issues). Hillsdale, N. J.: Lawrence Erlbaum Associates.

Smith, L. B. (1981). Importance of overall similarity of objects for adults' and children's classifications. Journal of Experimental Psychology: Human Perception and Performance, 7(4), 811-824.

Smith, L. B., Jones, S. S., & Landau, B. (1996). Naming in young children: A dumb attentional mechanism? Cognition, 60(2), 143-171.

Song, J. J. (1996). Causatives and Causation. New York: Addison, Wesley, Longman.Talmy, L. (Ed.). (1976). Semantic Causative Types (Vol. 6). New York: Academic Press.Tomasello, M. (1992). First verbs: A Case Study of Early Grammatical Development.

Cambridge: Cambridge University Press.Tomasello, M. (2000). Do young children have adult syntactic competence? Cognition,

74(3), 209-253.Tomasello, M. and Kruger, A. (1992). Join attention on actions: Acquiring verbs in

ostensive and non-ostensive contexts. Journal of Child Language 19, 311-333.Tomasello, M., Aktar, N., Dodson, K., & Rekau, L. (1997). Differential productivity in

young children's use of nouns and verbs. Journal of Child Language, 24(2), 373-387.

Trueswell, J. C., Tanenhaus, M. K., & Kello, C. (1993). Verb specific constraints in sentence processing: Separating effects of lexical preference from garden-paths. Journal of Experimental Psychology: Learning, Memory and Cognition, 19(3), 528-553.

Windfuhr, G. (1979). Persian Grammar. New York: Mouton.Wisniewski, E. J. (1995). Prior knowledge and functionally relevant features in concept

learning. Memory and Cognition, 21(2), 449-468.Wolfenden, S. N. (1929). Outlines of Tibeto-Burman linguistic morphology. London:

Royal Asiatic Society.Zipf, G. K. (1935). The Psycho-Biology of Language. Boston: Houghton Mifflin.

36