LOOKAHEAD IN PHONOLOGICAL PROCESSING ...ling.umd.edu/~ellenlau/papers/LOOKAHEAD_PHON_PROC.doc ·...


LOOKAHEAD IN PHONOLOGICAL PROCESSING

Ellen F. Lau

January 2006

LING620

INTRODUCTION

The extent to which language production is incremental—at any level—is still a pretty contentious issue. At the same time, a lot of people will concede a fair amount of non-incrementality at the message-level, at least. We tend to feel like you figure out at least a main predicate holistically when you plan your message.

Let’s grant this intuition. In this paper, then, the question will not be whether or not some part of the linearly subsequent utterance has been planned already. Instead, I want to examine evidence bearing on two related questions: 1) can the wordform production process at the current word “lookahead” to that subsequent information? and 2) can any part of the subsequent “pre-planned” information be classified as wordform information? By wordform, we just mean any information to do with how a word is acoustically realized, whether it is phonetic, phonological, prosodic or whatever.

These two questions have critical implications for our computational models of phonological processes (to the extent that these models are meant to capture language-knowledge-in-use). The answer to the first question can tell us how much lookahead is reasonable to allow in our models—important to know because lookahead is computationally costly and is therefore somewhat aesthetically dispreferred, but it’s also very powerful. The answer to the second question can, among other things, give us information on how many “cycles” or “rescans” are reasonable to include in the model. For example, if default stresses for the whole sentence are specified in one stage, and thus can be input to a stage closer to phonological spellout where stress repair can take place, we have more support for modeling stress shift as a second machine which rescans text.

In the following, we’ll first take a look at the most well-known processing models of production, those of Levelt and colleagues, to get a sense of the processing stages and levels of representation required for production. Then we will review the evidence for lookahead in production of the current “phonological” wordform.

STANDARD PSYCHOLINGUISTIC MODELS OF PRODUCTION

Levelt, Roelofs, and Meyer (1999) is currently the most widely used and well-worked-out model of single-word production; the implemented model is known as WEAVER++. The model is described in terms of staged processes, going from conceptual preparation to lexical selection (more precisely, lemma selection, where a lemma is a package of lexical-syntactic information), and then to the processes that encompass wordform generation, the focus of our inquiry. In their model, four steps are involved in wordform generation: morphological access (e.g. from a lemma <<run + PROGRESSIVE>> retrieve morphemes <run> and <-ing>), phonological ‘spell-out’ (retrieving and encoding all the metrical and segmental information for the lemma simultaneously, labeled with positional information), syllabification/prosodification (incremental left-to-right generation of the phonological word, including imposing syllable structure), and phonetic encoding (computing an articulatory/gestural ‘score’). The whole process finishes with articulation in the motor system.

Figure 1. Flow diagram of Levelt, Roelofs & Meyer, 1999.

The model is serial stage, feedforward, but they allow overlap between the stages, in the sense that once some initial conceptual preparation or lemma selection is done, for example, the system can start working on the phonological encoding for that lemma and at the same time the conceptual preparation system can start working on the next word. This allows for strong incrementality in the model, as the conceptual framework, for example, is being built online only slightly ahead of the phonological encoding.

For the most part, the terminology and hypothesized subprocesses of LRM 1999 will be adopted for this paper. The one assumption that will be seriously questioned, following Keating and Shattuck-Hufnagel (2002), is that morphophonological encoding is strongly left-to-right incremental in nature, an assumption dating back to studies by Meyer (1990, 1991) showing reduced onset latencies for first-syllable but not second-syllable priming (lotus-loner, but not murder-boulder). Although we don’t dispute this finding per se, we question whether the extent of incrementality implicitly assumed in the model may have been too far-reaching.

LRM (1999) attempt to confine their discussion to single-word production, although as we will see later, they have to deal a little bit with word+1 in their description of syllabification. We have to go back to Levelt (1989) for a view on incrementality in multi-word utterances. In this model, word-level encoding is done first, then a prosody generator creates a metrical grid, and then segmental information is adjusted. Crucially, Levelt argues that this can largely be done incrementally word-by-word, although he allows the possibility of one-word lookahead to capture phenomena like the Beat Movement discussed below.

As discussed earlier, the models of Levelt and colleagues try to make the speech production process as incremental as possible. This implies a minimal amount of lookahead. But it’s important to consider that we can talk about two different kinds of lookahead: lookahead with respect to the same level of representation (e.g., phonological encoding of word0 is influenced by the phonological content of word2) and lookahead with respect to other levels of representation (e.g., phonological encoding of word0 is influenced by the conceptual content of word2). These two types of lookahead differ significantly in their implications for computational load and in what kind of data would serve as evidence for them. In the next two sections we deal with each in turn.

LOOKAHEAD WITH RESPECT TO HIGHER LEVELS OF REPRESENTATION

If encoding of higher levels of representation sequentially precedes encoding of lower levels for each given piece of utterance (although they overlap), the higher levels may always have ‘an edge’ on the lower levels such that a little bit more has been done on the higher level than the lower level at any given time. Since in LRM 1999 you do in fact have these kinds of sequentially overlapping processing levels in production, it would seem straightforward to expect this kind of lookahead to be possible, although it is not explicitly said to be necessary.

Within LRM 1999’s syllabification/prosodification stage, some amount of lookahead to higher levels seems to be required. Note that a key feature of this stage is that the domain of syllabification is thought to be the ‘phonological word’ (aka ‘prosodic word’); this allows you to capture syllabification across lexical boundaries like escor-tus for “escort us” (also supported by experimental work of Wheeldon & Lahiri 1997, among others). This means that the phonological word boundary must be defined before this prosodification process can begin, and it’s a little tricky to see how that will work; even if cases of multi-word phonological words are limited to syntactically-signalled cases like clitics and affixes, as they suggest, it means that the syntax for the next piece must be done before the post-phonological spell-out step of syllabification can be done, or else a fast, true resyllabification will need to be done after that syntax is encountered. In fact, Levelt (1989) suggests that metrical information may also be used to determine phonological words (p. 371); if this were true, it would come closer to arguing for within-level lookahead. Levelt (1989) also mentions the pair I know where it’s located / *I know where it’s (due to Pullum & Zwicky 1988), which suggests that the possibility of cliticization may depend on lookahead past the potential clitic itself. It’s not clear what the nature of the constraint is, whether syntactic, prosodic, or phonological, so it is also not clear what level the lookahead would need to be at to capture this contrast.

Ferreira (1993) presents another type of evidence for lookahead to a higher level. In this paper, Ferreira shows evidence that at least parts of the prosodic timing contour of the sentence are created without reference to the segmental content. A short-vowel word like black has an intrinsically shorter duration than a long-vowel word like green. However, when the duration of the word is added to the duration of the following pause, Ferreira found that the sum was equal for the short- and long-vowel words. In other words, the timing ‘slot’ allotted to the word was always the same, and the following pause was used to fill up the remaining space in the case of a shorter word. This relationship held whether the words were phrase-medial or phrase-final, even though the sum of word+pause was larger overall in the phrase-final case due to the phenomenon of phrase-final lengthening. Although this result does not directly tell us whether the creation of the prosodic slot is temporally prior to the insertion of segmental information, it provides us a case in which such temporal precedence is at least quite feasible, since the timing of the slots is independent of segmental information. We might then easily imagine a case where the phonological instantiation of word0 was influenced by the prosodic timing information about a word slot several words later, without putting undue computational burden on the producer. This kind of prosody-first architecture is in fact what Keating & Shattuck-Hufnagel (2002) end up arguing for at the end of their article.
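The slot-filling relationship Ferreira reports can be illustrated with a small sketch. The durations below are invented for illustration and are not Ferreira’s measurements; the names pause_after, word_plus_pause, WORD_MS and SLOT_MS are likewise assumptions made for this example.

```python
def pause_after(word_ms: float, slot_ms: float) -> float:
    """Return the pause needed to fill out a fixed timing slot."""
    return max(0.0, slot_ms - word_ms)


# Hypothetical durations in milliseconds (illustrative only).
WORD_MS = {"black": 250.0, "green": 320.0}
# The phrase-final slot is longer, mimicking phrase-final lengthening.
SLOT_MS = {"medial": 400.0, "final": 550.0}


def word_plus_pause(word: str, position: str) -> float:
    """Total time taken by a word and the pause that follows it."""
    word_ms = WORD_MS[word]
    return word_ms + pause_after(word_ms, SLOT_MS[position])


for position in SLOT_MS:
    totals = {word: word_plus_pause(word, position) for word in WORD_MS}
    print(position, totals)
```

The point of the sketch is that the slot sizes never consult the segmental content of the word, so in principle they could be laid out before the words that fill them are phonologically encoded.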

Another piece of evidence for some amount of lookahead beyond the current word in phonological production comes from recent computational linguistic analyses of the effects of predictability on phonological reduction (Bell, Jurafsky, Fosler-Lussier, Girand, Gregory, & Gildea, 2003). In general, these analyses focus on predictability from the preceding word, which involves only lookback (to w-1) rather than lookahead. However, Bell and colleagues show that not only does the conditional probability of the current word given the previous word matter for the reduction effect, the conditional probability of the current word given the following word also leads to shortening effects. If this effect is real, then the production mechanism must have access to the following word at the time it is phonologically encoding the current word. However, the factor of predictability could reasonably be thought to be represented at a higher level of representation than the wordform, for example the conceptual level or the lemma level.
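To make the two predictability measures concrete, here is a minimal sketch that estimates both conditional probabilities from bigram counts. It uses plain relative frequencies over a made-up token list; it is not the estimation procedure or the data of Bell and colleagues, and all function names are assumptions for this example.

```python
from collections import Counter


def bigram_model(tokens):
    """Collect the counts needed for both conditional probabilities."""
    pairs = list(zip(tokens, tokens[1:]))
    bigrams = Counter(pairs)
    as_first = Counter(first for first, _ in pairs)     # how often a word precedes something
    as_second = Counter(second for _, second in pairs)  # how often a word follows something
    return bigrams, as_first, as_second


def p_given_previous(word, previous, model):
    """Estimate P(word | previous word): requires lookback only."""
    bigrams, as_first, _ = model
    return bigrams[(previous, word)] / as_first[previous] if as_first[previous] else 0.0


def p_given_following(word, following, model):
    """Estimate P(word | following word): requires knowing the next word."""
    bigrams, _, as_second = model
    return bigrams[(word, following)] / as_second[following] if as_second[following] else 0.0


# Toy token sequence (hypothetical, for illustration only).
tokens = "kind of nice kind of odd sort of nice".split()
model = bigram_model(tokens)
print(p_given_previous("of", "kind", model))
print(p_given_following("kind", "of", model))
```

The second function is the one that matters for lookahead: it can only be evaluated if the identity of the upcoming word is already available when the current word is being encoded.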

Above I have given a sampling of evidence for looking “up and over” in production, but I think there is probably much more. Most models, including LRM 1999 and Levelt 1989, assume that the higher level stages of production “lead” the lower levels to some degree; my sense is that the debate is much more about “how”, “by how much”, and “which levels affect which” than “whether”. The question of “how much” with respect to lookahead is further complicated by the necessity for converting between different units at different levels. On the other hand, the issue of whether you have a significant amount of lookahead within a level of representation (the size of the workspace) seems both more clear-cut and more contentious, and thus this is the question we will focus on for the remainder of the paper.

LOOKAHEAD WITHIN A LEVEL OF REPRESENTATION

The other kind of lookahead, lookahead with respect to the same level, is more computationally problematic. A model that incorporates this kind of lookahead must have more than one object available to its ‘workspace’ at a time, and it must be working on the representational spell-out of several objects simultaneously. My sense is that it is this kind of lookahead that Levelt would most like to avoid. Keating & Shattuck-Hufnagel have done a great job of enumerating some of the key findings that would appear to suggest that this kind of within-level lookahead does happen. Below I briefly describe these findings, organized by the amount of lookahead they indicate, and assess how successful each is as an argument for lookahead.

Lookahead: Syllable+1

Syllabification

In LRM (1999), syllabification is seen as a late stage of the phonological spell-out step, as it depends on the phonological environment (e-scort but e-scor-ting), and it is thought to be done in a strongly left-to-right incremental fashion: retrieve the segment, associate it with a syllable position, and move to the next. Some minimal amount of lookahead (they suggest to the next vowel) is needed to get things right here. The example they use is that a consonant that would be assigned as onset by default must be assigned as coda if at the end of the word, but presumably this would arise in other cases as well, e.g. to obey Maximum Onset and sonority constraints. However, this amount of lookahead could be thought to be less problematic than most because the theory already says that the segmental information for the whole phonological word should be available in the workspace by the syllabification stage; thus, assuming lookahead to some later parts of the segmental information when working on syllabifying earlier parts does not require adding any pieces to the buffer that weren’t already there.
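The kind of lookahead at issue can be sketched as a toy left-to-right syllabifier that, on reaching a consonant, scans ahead only as far as the next vowel (or the word edge). This is an illustration under simplifying assumptions, not the WEAVER++ algorithm: the input is a rough one-letter-per-segment transcription, and the set LEGAL_ONSETS is a small hypothetical stand-in for Maximum Onset and sonority constraints.

```python
VOWELS = set("aeiou")
# Hypothetical toy inventory of permissible complex onsets.
LEGAL_ONSETS = {"sk", "st", "sp", "tr", "pl", "kr"}


def syllabify(segments: str) -> list[str]:
    """Syllabify left to right, looking ahead at most to the next vowel."""
    syllables: list[str] = []
    current = ""
    i = 0
    while i < len(segments):
        if segments[i] in VOWELS:
            current += segments[i]
            i += 1
            continue
        # Consonant: scan ahead to the next vowel or to the end of the word.
        j = i
        while j < len(segments) and segments[j] not in VOWELS:
            j += 1
        cluster = segments[i:j]
        if not current or j == len(segments):
            # Word-initial cluster is all onset; word-final cluster is all coda.
            current += cluster
        else:
            # Give the next syllable the longest legal onset; the rest is coda.
            split = len(cluster) - 1
            for k in range(len(cluster)):
                if cluster[k:] in LEGAL_ONSETS or len(cluster) - k == 1:
                    split = k
                    break
            syllables.append(current + cluster[:split])
            current = cluster[split:]
        i = j
    if current:
        syllables.append(current)
    return syllables


print(syllabify("eskort"))    # "escort" as its own phonological word
print(syllabify("eskortus"))  # "escort us" as a single phonological word
```

Note that the final t of escort ends up as a coda in the first call but as an onset in the second, and the only information needed to decide is whether a vowel follows within the same phonological word.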

Priming evidence

Meyer & Schriefers (1991) showed that explicit first- or second-syllable primes yield facilitation for production of a disyllabic word. In contrast, implicit priming studies by Meyer (1990; 1991) show no facilitation in response latency for within-list priming of the second syllable, although she does find facilitation for priming of the first syllable (loner-local vs murder-boulder); this was the original motivation for the left-to-right incrementality in syllabification. LRM (1999) explain this discrepancy by saying that explicit priming speeds up the non-incremental phonological encoding stage, while implicit priming speeds up the incremental syllabification stage. It’s not completely clear to me why implicit priming would be predicted to affect one stage and not the other, but since what we care about here is the possibility of lookahead, and since LRM (1999) seem happy to allow lookahead up to the vowel of the next syllable, we can leave open the question of how much evidence there is to think that the syllabification is still really in some sense “incremental”.

Lookahead: Phonological Word+1

Beat Movement

Beat movement (aka Rhythm Rule or Early Accent) is a phenomenon which has been argued to exist in many languages as a means of removing stress clashes between the main word stress of adjacent words (e.g. Selkirk 1984, Liberman 1975, Hayes 1984). A common example is a word like JapaNEse, with highest prominence on the last syllable; when combined with a word like INstitute, with highest prominence on the first syllable, the main stress in Japanese seems to ‘move’ from the last syllable to the first, arguably to avoid clashing with the early stress in the second word, JApanese INstitute. In order to know which syllable of Japanese to stress, the system must look at the inherent stress pattern of the upcoming word.
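As a rough illustration of why this requires one word of lookahead, the sketch below encodes each word as a list of per-syllable stress levels and shifts the main stress leftward only when the next word begins with a main stress. The encoding, the definition of clash as two adjacent main stresses, and the function name are all simplifying assumptions for this example rather than any particular author’s formalization.

```python
PRIMARY, SECONDARY, UNSTRESSED = 2, 1, 0


def apply_beat_movement(word, next_word):
    """Return the stress pattern of `word`, shifted if it clashes with `next_word`."""
    clash = (
        bool(word)
        and bool(next_word)
        and word[-1] == PRIMARY
        and next_word[0] == PRIMARY
    )
    if not clash:
        return list(word)
    # Look for the nearest earlier secondary stress to host the moved beat.
    for i in range(len(word) - 2, -1, -1):
        if word[i] == SECONDARY:
            shifted = list(word)
            shifted[i], shifted[-1] = PRIMARY, SECONDARY
            return shifted
    return list(word)  # no landing site, so the pattern is left alone


japanese = [SECONDARY, UNSTRESSED, PRIMARY]     # JapaNEse
institute = [PRIMARY, UNSTRESSED, UNSTRESSED]   # INstitute
harmonica = [UNSTRESSED, PRIMARY, UNSTRESSED, UNSTRESSED]  # harMONica

print(apply_beat_movement(japanese, institute))
print(apply_beat_movement(japanese, harmonica))
```

The function cannot return anything for the current word until the stress pattern of the following word has been supplied, which is exactly the within-level lookahead under discussion.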

Eurhythmy and Lapses

Related to the beat movement cases which are designed to avoid stress clashes, several authors have postulated that a broader preference for eurhythmy, or even rhythmic alternation of strong-weak, causes producers to avoid long strings of weak syllables, or lapses, as well (Hayes 1984, Nespor & Vogel 1989).

Hayes (1984) argues for constraints of eurhythmy which are independent of, but interact with, the Rhythm Rule discussed above. For example, his Quadrisyllabic Rule states that the ideally eurhythmic grid will contain a row whose marks are spaced four syllables apart (or alternately, four time units apart), and that Beat Movement (in cases of clash) and Beat Addition (in cases of lapse) should take place only when they will make the stress pattern closer to this ideal. Nespor and Vogel (1989), on the other hand, argue specifically for a more local analysis based on beat deletion, beat addition, and beat insertion (instantiated as either first-syllable lengthening or pause). Thus, although a long-distance lookahead effect of clash or lapse could be seen in their system, if lookahead was built into the system such that the effect of future beat deletions could be incorporated into the metrical instantiation of the current syllable, it does not follow from their system the way it would follow from a Beat Movement system. The only lookahead required for their rules is to the stress of the next syllable.

Although, as discussed below, there is little experimental evidence for rule-driven stress shift as a reaction to violations of eurhythmy, there are some hints that people are nevertheless sensitive to it. Cutler (1980) showed that speech errors like deleting a syllable are more common in positions where eurhythmy is violated, and V. Ferreira and colleagues (p.c.) obtained some preliminary evidence that the use of the optional complementizer that was dependent on whether its insertion would improve or worsen the eurhythmy of the sentence. Further work is still needed before this phenomenon can be taken as a strong argument for non-incrementality.

Page 7: LOOKAHEAD IN PHONOLOGICAL PROCESSING ...ling.umd.edu/~ellenlau/papers/LOOKAHEAD_PHON_PROC.doc · Web viewPast this next word, however, while there are a lot of cases that have seemed

Speech Errors

The speech error literature is filled with examples in which segments of adjacent words are switched, as in flow snurries (Dell 1986). For this kind of phonological switch error to happen, parts of the phonology of both words must have been available during the encoding of the first word. On the other hand, it could be argued that this is actually the cause of the error; in other words, that it is only when the phonology of two words is erroneously available at the same time that errors can happen, and thus in normal production it is ecologically useful for the word+1 not to be active. Thus, the speech error evidence is suggestive with respect to lookahead, but has multiple interpretations.

Picture-naming evidence

Costa and Caramazza (2002) present what is perhaps the most unambiguous evidence for some impact of the phonology of word+1 on production. In two experiments, Costa and Caramazza looked at the latency to begin production of phrases like the car and the red car in a picture-naming task in which a distractor word was superimposed on the picture to be named. They replicated previous findings in showing that production latency for phrases like the car was facilitated by a distractor phonologically related to the noun, even though the noun was not the first lexical item in the utterance (this kind of finding was part of the original motivation for tying phonological production to the phonological word rather than the lemma). More importantly, however, they showed the same facilitation in production latency for distractors phonologically related to the noun in production of determiner-adjective-noun sequences like the red car. This suggests that planning for the phonology of the noun is affecting latency to start producing the preceding phonological word. In a third experiment, they show that the same effect holds in production of noun phrases in Spanish, which has determiner-noun-adjective order; here, priming of the phonology of the phonological word+1 adjective can affect the latency of the onset, suggesting that this planning effect is not specific to the head of a phrase. Roelofs (1998) also found results consistent with word+1 latency facilitation in a paradigm using across-list implicit priming (also see Miozzo & Caramazza, 1999; Alario & Caramazza, 2002). This work gives us fairly good reason to think that the segmental information for the next phonological word could be somewhat available for lookahead. On the other hand, Meyer (1996) failed to find such effects for the phonological word+2 or +3 (e.g. the arrow is next to the bag), demonstrating that if the phonology further ahead is available in some way, it is at least not active enough to engender these kinds of distractor-priming effects.

Lookahead: Phonological Word+x

Beat Movement

In more extreme cases of beat movement, it is argued that the system must look several words ahead to find the correct stress, as in sixTEEN vs sixTEEN JapaNEse vs SIXteen JApanese INstitutes, or ANtique NINEteen TWENty-seven MOtorcars (Keating & Shattuck-Hufnagel use the example MISsissippi LEgislators’ ANtique NINEteen TWENty-seven MOtorcars, but since primary stress for legislator is typically on the first syllable, the first two words compose a separate dependency). What is important to note is that, in contrast to resolving stress clash by stress deletion of one of the syllables involved in the clash, beat movement requires that a syllable further back from the syllables involved in the clash be affected (the stress unit needs to ‘move’ somewhere), so it requires a considerable amount of lookahead. Levelt (1989) accepts these cases, but as he wants to limit within-level lookahead to one word in general, he suggests that Beat movement requiring more than one word of lookahead is limited in practice to more formal speaking situations that intrinsically involve more planning.

Although I think these are very interesting cases, I believe that the stress data are more complicated than has often been characterized in summaries of the literature. In many cases, it is unclear to me whether the stress movement is actually a response to stress clash, or rather something to do with information structure or syntactic structure. For example, people also has main stress on the first syllable, but I find JapaNESE PEOple to sound quite as good as JApanese PEOple, with perhaps some slight difference in focus structure from one to the other. Similarly, I find ANtique harMONica to be as good or better than anTIQUE harMONica, even though the stress clash isn’t forced here since harmonica has second syllable main stress. Selkirk (1984) contrasts TenneSEE with TENnessee AIR, but my intuition is that you actually get something more like Tennessee AIR, in accordance with a deletion analysis of Beat Movement (e.g. Horne 1990, Nespor & Vogel 1989) where stress doesn’t actually move due to stress clash, it is simply deleted.

It’s also hard to know how much of the difference from one case to another is due to the utterance-level prosody differences—for example, the reason the last syllable of TenneSEE is so much stronger in isolation than in a compound might be that it is getting the benefit of utterance-final-lengthening in the former. Relatedly, Selkirk points out that the syntactic/semantic structure assigned to a string like Chinese expert seems to affect the preferred stress: an expert on Chinese seems to be stressed as ChiNESE EXpert, while an expert who is Chinese seems to be stressed as CHInese EXpert.

The few studies that have examined the empirical facts more rigorously also do not provide especially strong support for Beat movement as described in the literature. Cooper and Eady (1986) had speakers read sentences like Thirteen corporations/companies submitted bids... According to the predictions of Beat movement, the stress pattern of companies should induce stress shift from thirTEEN to THIRteen. However, careful analysis of five experiments showed no evidence for this. Kelly and Bock (1988) did show a small but measurable stress shift due to context, but only on nonwords. Since nonwords don’t have any lexically specified stress pattern, this seems less of an example of stress shift than stress choice; it shows that in the absence of any other constraint, people prefer to create a rhythmically even string, but not that this numerically weak preference could ever override lexically specified stress.

For myself, then, before I would feel comfortable taking the Beat movement cases as strong evidence for the necessity of multiword lookahead within the stress level, I would like to see more corpus and experimental evidence. Most of the literature compares a word’s stress in isolation with a word’s stress within a particular context of stress, but what I would really like to see is a comparison of a single word in the exact same position in the syntax and information structure, but with words of different stress patterns following. If you then saw a reliable pattern of differences in the stress of that word based on the following stress, you’d have a much tighter case for the necessity of lookahead related to Beat movement.

Speech errors

Some speech errors seem to involve switches across nonadjacent phonological words, as in with this wing I do red from Fromkin (1971), and you getter stop for bas from Keating and Shattuck-Hufnagel (2002). This could be interpreted as evidence for lookahead for more than word+1, but the possibility discussed above remains that the phonology of upcoming words being available actually signals a failure in the production process, not the normal case; this is perhaps more plausible for these nonadjacent switches than the word+1 errors, since they seem to be so much rarer than the adjacent case.

Phrase boundary placement

Several authors have suggested that the placement of phrase boundaries depends on the length of the entire utterance. If true, this would clearly be a strong challenge to incrementality. The two findings cited by Keating and Shattuck-Hufnagel (2002) are Gee & Grosjean (1983) and Watson (2002). Gee and Grosjean (1983) argued that there is a tendency for intonational phrases to have equal lengths; this would mean that IP boundaries would come about halfway through the sentence, implying that the entire sentence length was known. Watson (2002) also showed that the length of upcoming material in an utterance influenced the placement of IP boundaries. A new piece of data comes from Krivokapić (in press), who finds that the length of the pause at intonational phrase boundaries is dependent on both the length and the amount of prosodic branching in the following phrase.

If these findings hold, they clearly challenge the strictest, across-levels view of incrementality in production; however, it is less clear that they would challenge the idea of incrementality within the same representational level. We might imagine that an estimate of utterance length could be done fairly accurately based on the conceptual complexity of the message, or the syntactic representation; it’s not obvious that any of the prosodic or phonological encoding is necessary to capture these effects. Some authors (e.g. Krivokapić) suggest that their effects can be naturally interpreted as the result of processing load at the conceptual, syntactic, or prosodic levels. However, it’s also not clear that the experimental paradigm used by these studies, reading sentences, can tell us very much about the extent of lookahead in normal production. In reading, there are very clear visual cues to length, and oral readers who are even moderately experienced are likely to use these cues to create harmonious phrasing. Thus, although these studies can certainly give us insight into our mental representation of prosodic structure, it’s hard to take them as evidence for lookahead in normal production.

Cayuvava

In a report by Key (1961) on the dying indigenous Bolivian language Cayuvava, the author indicated that in this language:

…tri-syllable stress group patterning marks strong stress from the final syllable of multi-syllable utterances. Strong stress occurs on the antepenultimate syllable and every third syllable preceding it. This appears to be a sentence-level pattern but it can be interrupted by any morpheme with phonemic strong stress. Thereafter the pattern will resume, counting from the interrupting strong-stressed morpheme.

If the language of Cayuvava really counted stress backwards from the end of a sentence, it would imply that the human production system at least has the capacity of having a phonological lookahead window limited only by the length of utterances, even if this capacity is not usually exercised. However, there is some reason to doubt that this pattern held strictly for longer utterances. Later in the same text Key notes that although the stress changes from ‘kita (water) to iki ‘ta when in the utterance iki ‘ta pare ‘repeha (the-water is-clean), this does not always hold:

When, however, such a phrase is broken into smaller phrases, the stress patterns vary to fit the new smaller units; see the following example: ari ‘supuru kida ‘bukue (already-is-strong the-wind).

If we make the not-outlandish assumption that longer utterances will tend to be broken into shorter phrases in general in Cayuvava, then we can imagine that the window of lookahead required will be considerably shorter. We also see that longer utterances from a transcribed text in Key’s other work on Cayuvava (1967) do not follow the three-syllable stress pattern exactly, for example

hibi’hiine ‘rakoko’renapu pa’iraha (stress should fall on na instead of re)
I-did-it-again to-toast-chivé fine

Unfortunately the data on Cayuvava is too sparse to make a conclusive determination, but in the absence of other evidence for utterance-length lookahead, it seems reasonable to think that we could capture the existing data with a combination of one-prosodic-word-lookahead, phrasal boundaries, and lexicalized stress (which restarts counting).
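The basic counting pattern in Key’s description can be sketched in a few lines. This is only an illustration: it ignores the interruption by lexically stressed morphemes and the phrase-breaking that Key also describes, and the syllable division of the example utterance is my own assumption.

```python
def cayuvava_stress(n_syllables: int) -> list[int]:
    """0-based indices of stressed syllables, counted back from the utterance end.

    Stress falls on the antepenultimate syllable and on every third
    syllable before it, so the total syllable count must be known first.
    """
    return sorted(range(n_syllables - 3, -1, -3))


# Assumed syllable division of "iki 'ta pare 'repeha" (the-water is-clean).
syllables = ["i", "ki", "ta", "pa", "re", "re", "pe", "ha"]
stressed = cayuvava_stress(len(syllables))
print([syllables[i] for i in stressed])
```

Under that division the sketch picks out ta and the re of repeha, matching the stress marks in Key’s example. The relevant design point is that the function takes the utterance length as its input: a strictly incremental producer that did not yet know how many syllables were coming could not compute where the first stress goes.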

Prominence Assignment

Keating & Shattuck-Hufnagel (2002) mention a broad preference for placing a pitch accent as early in the first accented word of a new intonational phrase as possible, which they term Early Accent Placement (Shattuck-Hufnagel et al. 1994, Beckman et al. 1990). They suggest that this is a problem for strict incrementality. However, I think that this case is not so problematic; since the effect happens at the beginning of an intonational phrase, it can be triggered by the intonational phrase marker, and thus doesn’t require reference to anything further along in the string. There may be other aspects of prominence marking that require you to assign the entire intonational phrasing of the sentence at once, but that is another set of arguments, one of which, pitch declination, is presented below.

Pitch Declination

In early work, Maeda (1976) reported that pitch declines relatively linearly across an utterance. Declination can be defined as a lowering of pitch range which begins at the start of the utterance, without regard to tonal description. Early studies found that the absolute amount of declination in an English utterance seemed to be independent of utterance length; thus the slope of declination had to differ depending on length (Maeda, 1976; Sternberg et al., 1980; Pierrehumbert, 1979; Swerts, Strangert, & Heldner, 1996; but cf. Sorensen & Cooper, 1980). If declination curves were indeed linear, then the size of each step could only be determined with knowledge of the entire utterance length. Another observation that would seem to require lookahead comes from Sorensen and Cooper’s (1979) study, in which speakers seemed to start off higher in pitch for longer utterances, to give them more space to decline; however, this was also a reading study, and as far as I know this observation is unattested in natural production.

Some indirect evidence against the idea that declination is linear comes from a declination perception study by Pierrehumbert (1979). She found that while participants normalized for declination across an utterance, the amount by which they normalized did not differ when length differed by one to three syllables. However, this evidence must be taken somewhat lightly, as perception might not be sensitive to all the same factors governing production, and the length difference might also have been too small to cause a measurable difference in slope. In her thesis, Pierrehumbert (1980) notes in passing that the declination pattern looked slightly curved (p. 130), although in her model she holds the slope constant to simplify her minimal-lookahead model of intonational downstepping. Thus, there may be reason to think that the declination curve could be modeled as a slow exponential, reducing at least the extent of the lookahead needed to model the data. Also, the generality of declination in production has since been questioned; Beckman & Pierrehumbert (1986) argue that many declination effects can actually be explained better as other prosodic phenomena, and suggest that the small remaining amount of declination may be specific to certain information structure conditions that are met in isolated sentences. If the occurrence of declination is dependent on the discourse context, there may be support for the idea that length effects may be relative to ‘conceptual size’ rather than ‘phonological size’.

One clear way to settle the question would be to hold semantic and syntactic complexity constant but manipulate length through number of syllables or vowel duration. If manipulating length through these phonological variables affected the IP boundary placement or the declination curve, it would suggest that a considerable amount of lookahead within a representational level is in fact necessary to capture this phenomenon. I think it’s more likely that these factors would not have an effect, suggesting that lookahead through the conceptual or syntactic representation is enough to get a reasonable estimate for purposes of declination and boundary placement. One further possibility that should be examined with respect to the declination phenomenon is whether the curve is truly linear, or whether, when measured more accurately, it might be just a slow exponential—which wouldn’t require entire utterance-length information to achieve.
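
The difference between the two curve shapes in terms of lookahead can be sketched as follows. All numbers (a 200 Hz starting pitch, a 40 Hz total drop, a 160 Hz floor, a decay factor of 0.5) and the function names are illustrative assumptions, not values taken from any of the studies cited above. The linear version needs the syllable count before it can produce the first step, while the exponential version computes each value from the previous one alone.

```python
from itertools import islice


def linear_declination(n_syllables, start_hz=200.0, total_drop_hz=40.0):
    """Fixed total drop spread evenly over the utterance.

    The step size depends on n_syllables, so the utterance length must be
    known before the second syllable's pitch can be set.
    """
    if n_syllables < 2:
        return [start_hz] * n_syllables
    step = total_drop_hz / (n_syllables - 1)
    return [start_hz - step * i for i in range(n_syllables)]


def exponential_declination(start_hz=200.0, floor_hz=160.0, decay=0.5):
    """Pitch decays toward a floor; each value depends only on the last one.

    No utterance length is consulted, so this is an open-ended generator.
    """
    f0 = start_hz
    while True:
        yield f0
        f0 = floor_hz + decay * (f0 - floor_hz)


short_linear = linear_declination(5)
long_linear = linear_declination(9)
# Same total drop, different slopes: the step is 10 Hz versus 5 Hz.
print(short_linear)
print(long_linear)

# The exponential contour for a short utterance is simply a prefix of the
# contour for a longer one, so no length information is required.
print(list(islice(exponential_declination(), 5)))
print(list(islice(exponential_declination(), 9)))
```

A slow exponential of this kind would look nearly linear over short utterances, which is why more accurate measurement would be needed to tell the two apart.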

CONCLUSION

In this paper I have briefly reviewed the evidence for lookahead in phonological production. In the first part I showed that there is pretty clear evidence that phonological production reflects lookahead to ‘higher’ levels of representation like conceptual, prosodic, and syntactic structure further on in the sentence, and this is consistent with the assumptions of most models. In the second part I examined the evidence for lookahead within the phonological level (metrical, segmental, and syllabic information) to determine the extent to which phonological information from upcoming words affects the production of the current word. There seems to be converging evidence from metrical phenomena and experimental work that the phonological information of the upcoming phonological word (word + 1) is available to potentially influence production of the current word. Past this next word, however, while there are a lot of cases that have seemed to suggest that phonological information even further down the line may be available to influence production of the current word, there seems to be no conclusive evidence as yet that this is true in normal production. Thus, for the purpose of building stress-assignment machines that can realistically model human competence, my feeling is that it would be reasonable to allow lookahead within the current word and up to one phonological word forward; any further than that would not be well-founded unless it depended on something like higher-level intonational phrasing.

BIBLIOGRAPHY

Alario, F.X., & Caramazza, A. (2002). The production of determiners: Evidence from French. Cognition, 82, 179-223.

Beckman, M., Swora, M., Rauschenberg, J. & de Jong, K. (1990). Stress shift, stress clash and polysyllabic shortening in a prosodically annotated discourse. In Proceedings of the 1990 International Conference on Spoken Language Processing, Kobe, Japan, 1, 5-8.

Beckman, M. & Pierrehumbert, J. (1986). Intonational structure in Japanese and English. Phonology Yearbook 3: 255-309.

Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, P., Gregory, M., & Gildea, D. (2003). Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America, 113(2), 1001-1024.

Cooper, W.E. & Eady, S.J. (1986). Metrical phonology in speech production. Journal of Memory and Language, 25, 369-384.

Costa, A., & Caramazza, A. (2002). The production of noun phrases in English and Spanish: Implications for the scope of phonological encoding in speech production. Journal of Memory & Language, 46, 178-198.

Cutler, A. (1980). Syllable omission errors and isochrony. In H.W. Dechert & M.Raupach (Eds.), Temporal variables in speech. Studies in honour of Frieda Goldman-Eisler. The Hague: Mouton.

Dell, G.S. (1986). A spreading activation theory of retrieval in sentence production. Psychological Review, 93, 283-321.

Ferreira, F. (1993). Creation of prosody during sentence production. Psychological Review, 100, 233-253.

Fromkin, V. (1971). The non-anomalous nature of anomalous utterances. Language 47, 27-52.

Gee, J.P. & Grosjean, F. (1983). Performance Structures: A Psycholinguistic and Linguistic Appraisal. Cognitive Psychology 15, 411-458.

Hayes, B. (1984). The phonology of rhythm in English. Linguistic Inquiry, 15, 33-74.

Horne, M. (1990). Empirical evidence for a deletion formulation of the rhythm rule in English. Linguistics 28, 959-981.

Keating, P. & Shattuck-Hufnagel, S. (2002). A Prosodic View of Word Form Encoding for Speech Production. UCLA Working Papers in Phonetics 101, 112-156.

Kelly, M.H., & Bock, J.K. (1988). Stress in time. Journal of Experimental Psychology: Human Perception and Performance, 14, 389-403.

Key, H. H. (1967). Morphology of Cayuvava. The Hague/Paris: Mouton.

Key, H. H. (1961). Phonotactics of Cayuvava. International Journal of American Linguistics, 27(2), 143-50.

Krivokapić, J. (in press). Prosodic planning: Effects of phrasal length and complexity on pause duration. Journal of Phonetics.

Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.

Levelt, W.J.M., Roelofs, A., & Meyer, A.S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1-38.

Liberman, M. (1975). The intonational system of English. MIT dissertation.

Maeda, S. (1976). A Characterization of American English Intonation. MIT dissertation.

Meyer, A.S. (1990). The time course of phonological encoding in language production: The encoding of successive syllables of a word. Journal of Memory and Language, 29, 524-545.

Meyer, A.S. (1991). The time course of phonological encoding in language production: Phonological encoding inside a syllable. Journal of Memory and Language, 30, 69-89.

Meyer, A.S. (1996). Lexical access in phrase and sentence production: Results from picture-word interference experiments. Journal of Memory and Language, 35, 477-496.

Miozzo, M., & Caramazza, A. (1999). The selection of determiners in noun phrase production. Journal of Experimental Psychology: Learning, Memory, & Cognition, 25, 907-922.

Nespor, M. & Vogel, I. (1989). On clashes and lapses. Phonology 6, 69-116.

Pierrehumbert, J. (1979). The perception of fundamental frequency declination. Journal of the Acoustical Society of America, 66, 363-369.

Pierrehumbert, J. (1980). The Phonetics and Phonology of English Intonation. MIT dissertation.

Pullum, G.K., & Zwicky, A.M. (1988). The syntax-phonology interface. In F.J. Newmeyer (Ed.), Linguistics: The Cambridge Survey. Cambridge University Press.

Roelofs, A. (1998). Rightward incrementality in encoding simple phrasal forms in speech production: Verb-particle combinations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 904-921.

Selkirk, E. (1984). Phonology and syntax: the relation between sound and structure. MIT Press.

Shattuck-Hufnagel, S., Ostendorf, M., & Ross, K. (1994). Stress shift and early pitch accent placement in lexical items in American English. Journal of Phonetics 22, 357-388.

Sorensen, J., & Cooper, W. (1980). Syntactic coding of fundamental frequency in speech production. In R.A. Cole (Ed.), Perception and production of fluent speech. Hillsdale, NJ: Erlbaum.

Sternberg, S., Wright, C., Knoll, R., & Monsell, S. (1980). Motor programs in rapid speech: Additional evidence. In R.A. Cole (Ed.), Perception and production of fluent speech. Hillsdale, NJ: Erlbaum.

Swerts, M., Strangert, E., & Heldner, M. (1996). F0 declination in spontaneous and read-aloud speech. In Proceedings of the International Conference on Spoken Language Processing, 3, 1501-1504.

Watson, D. & Gibson, E. (2004). Making sense of the sense unit condition. Linguistic Inquiry, 35, 508-517.

Wheeldon, L. R. & Lahiri, A. (1997). Prosodic units in language production. Journal of Memory and Language, 37, 356-381.