Early Association of Prosodic Focus with alleen ‘only ...uczaksz/Mulders and Szendroi...
Transcript of Early Association of Prosodic Focus with alleen ‘only ...uczaksz/Mulders and Szendroi...
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 1
001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
032
033
034
035
036
037
038
039
040
041
042
043
044
045
046
047
048
049
050
051
052
053
054
055
056
057
058
059
060
061
062
063
064
065
066
067
068
069
070
071
072
073
074
075
076
077
078
079
080
081
082
083
084
085
086
087
088
089
090
091
092
093
094
095
096
097
098
099
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
ORIGINAL RESEARCHpublished: xx March 2016
doi: 10.3389/fpsyg.2016.00150
Edited by:Judit Gervain,
Centre National de la RechercheScientifique – Université Paris
Descartes, France
Reviewed by:Bálint Forgács,
Université Paris Descartes, FranceAritz Irurtzun,
CNRS – IKER (UMR 5478), FranceAntje Sauermann,
Zentrum für AllgemeineSprachwissenschaft, Germany
*Correspondence:Kriszta Szendroi
[email protected];Iris Mulders
Specialty section:This article was submitted to
Language Sciences,a section of the journalFrontiers in Psychology
Received: 08 June 2015Accepted: 27 January 2016
Published: xx March 2016
Citation:Mulders I and Szendroi K (2016) Early
Association of Prosodic Focus withalleen ‘only’: Evidence from EyeMovements in the Visual-World
Paradigm. Front. Psychol. 7:150.doi: 10.3389/fpsyg.2016.00150
Early Association of Prosodic Focuswith alleen ‘only’: Evidence from EyeMovements in the Visual-WorldParadigmIris Mulders1* and Kriszta Szendroi2*
1 Utrecht Institute of Linguistics OTS, Utrecht University, Utrecht, Netherlands, 2 UCL Linguistics, Psychology and LanguageSciences, University College London, London, UK
In three visual-world eye tracking studies, we investigated the processing of sentencescontaining the focus-sensitive operator alleen ‘only’ and different pitch accents, suchas the Dutch Ik heb alleen SELDERIJ aan de brandweerman gegeven ‘I only gaveCELERY to the fireman’ versus Ik heb alleen selderij aan de BRANDWEERMANgegeven ‘I only gave celery to the FIREMAN’. Dutch, like English, allows accent shift toexpress different focus possibilities. Participants judged whether these utterances matchdifferent pictures: in Experiment 1 the Early Stress utterance matched the picture, inExperiment 2 both the Early and Late Stress utterance did, and in Experiment 3 neitherdid. We found that eye-gaze patterns start to diverge across the conditions alreadyas the indirect object is being heard. Our data also indicate that participants performanticipatory eye-movements based on the presence of prosodic focus during auditorysentence processing. Our investigation is the first to report the effect of varied prosodicaccent placement on different arguments in sentences with a semantic operator, alleen‘only’, on the time course of looks in the visual world paradigm. Using an operator in thevisual world paradigm allowed us to confirm that prosodic focus information immediatelygets integrated into the semantic parse of the proposition. Our study thus providesfurther evidence for fast, incremental prosodic focus processing in natural language.
Keywords: focus, semantics, marked stress, prosody, incremental language processing, eye tracking, visualworld paradigm, anticipatory eye movements and predictions
INTRODUCTION1
Prosodic Focus and Contrast: Pragmatic EffectFocus is an important information-structuring device. It occurs in every utterance, and itsignals to the hearer the most prominent part of the utterance: what is new, or what iscontrasted or highlighted. In many languages including English and Dutch, it is markedby prosodic prominence, specifically, with a pitch accent. Focus processing is crucial forcomprehension of utterances in context. To illustrate this, we can consider what makes a question–answer pair pragmatically felicitous. Capitals indicate prosodic stress and corresponding pitchaccent throughout. (1b), with prosodic accent on the direct object ‘some tea’ is a felicitous
1For ease of exposition, we illustrate the characteristics of focal stress and only with English examples. Everything we claimhere holds for Dutch in the same way.
Frontiers in Psychology | www.frontiersin.org 1 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 2
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
answer to the question in (1a), while (1c) with prosodic accenton the indirect object ‘to the woman’ is not. (Rather, it would bea felicitous answer to a different question, namely ‘Who did yougive some tea to?’) This is because the question in (1a) asks forinformation about the object that was given to the woman, andthus expects the responder to prosodically highlight the directobject in their response.
(1) a. What did you give to the woman?b. I gave some TEA to the woman.c. #I gave some tea to the WOMAN.
Prosodic focus can also play an important role in determiningthe felicity of utterances in a non-linguistic context. An utterancelike (2), with contrastive accent on the modifying adjective, isonly felicitous in a context where not just green balls, but ballsof some other color are also present. The pragmatic function ofthe accent placement here is contrastive.
(2) Give me the GREEN ball.
Given the importance of prosodic accent placement forinformation structuring, many studies have tried to uncoverthe effect of prosodic prominence on language processing.Eberhard et al. (1995) were the first to report an experimentinvolving a reference resolution task in a real world settinginvolving prosodic prominence. The instructions involved eithercontrastive or neutral stress (Touch the LARGE/large blue square)on modifying adjectives in two different visual contexts. Theresults showed that in the contrastive stress condition, thelatency of eye movements to the target referent was significantlyshorter in the setting where contrastive stress was informative(i.e., in contexts with large and small blue squares) than inthe uninformative setting (i.e., in contexts with only largeblue squares). The eye movement latency was also shorter inthe contrastive stress condition compared to the unstressedcondition. So, contrastive stress facilitated reference resolution(but cf. Arnold, 2008).
The facilitatory effect of the contrastive L+H∗ accent inEnglish reference resolution tasks in contrastive contexts hasbeen further supported in a series of experiments by Ito andSpeer (2008, 2011). Here participants heard pairs of instructionslike Hang the blue ball with a second instruction followingbearing either neutral or contrastive stress, e.g., Next, hang thegreen/GREEN ball. In addition to confirming the processingadvantage of contrastive accents on the modifying adjective whenused in a contrastive context, Ito and Speer also demonstratedthat the use of such accents leads to anticipatory looks to thepreviously mentioned entity type (i.e., balls) and to ‘garden path’effects if used in infelicitous contexts (e.g., blue angel followedby GREEN ball). They thus demonstrated early interpretation ofcontrastive prosody.
Note, however, that in all these experiments, the referenceresolution task can also be carried out without the presence ofthe contrastive accent. In other words, were the instructions readout with a different intonation, the reference resolution task couldstill be carried out correctly. The presence of the contrastiveaccent is facilitatory and its absence informative, but ultimately, it
only has a pragmatic effect: it does not contribute to the sentencemeaning directly as it does not change the truth conditions of thesentence.
Prosodic Focus and Only: SemanticIntegrationProsodic focus placement is not only relevant for pragmaticfelicity of certain utterances in linguistic or non-linguisticcontext. Sometimes the position of the prosodic focus withinthe utterance directly contributes to the semantic meaning ofthe utterance. Sentences that involve the operator only are anexample of this.
(3) I only gave some tea to the woman.
In sentences involving an operator, like only, the prosodicfocus placement is not only relevant for pragmatic felicity ofthe utterance in context. Rather, the position of the prosodicfocus within the utterance directly contributes to the semanticmeaning of the utterance. In fact, the correct semantics cannotbe determined without accentual information. So, presentedin writing, (3) is ambiguous; its meaning depends on itsaccentuation pattern. The ambiguity can be resolved by prosody,as in (4).
(4) a. I only gave some tea to the WOMAN.= The only person Igave some tea to was the woman.
b. I only gave some TEA to the woman. = The only thing Igave to the woman was some tea.
In (4a), with pitch accent on the indirect object, only associateswith the stress-bearing indirect object, the woman, while in(4b), with stress on tea, only associates with the direct object,some tea. Accordingly, linguistic theories agree that the differentinterpretations in (4a) and (4b) are an indirect result of the twostress patterns; they arise because the operator only is focus-sensitive, meaning that it associates in its interpretation with thefocus of the utterance (Horn, 1969; Krifka, 1992; Rooth, 1992).Focus, in turn, is determined by main stress and correspondingpitch accent in English (Chomsky, 1971) and Dutch.2,3
2Note that (4a) with indirect object stress is in fact ambiguous in itself betweenthe readings indicated in (i) and (ii). This is not important for the present studyfor two reasons. First, adults strongly prefer the reading in (i), which is the onetargeted in this study (Crain and Steedman, 1985). Second, the experiments involvephonetically marked accent on the indirect object, which again makes the readingin (i) to be the preferred one.
(i) The only person I gave some tea to was the woman = indirect object focusreading
(ii) The only thing I did was give some tea to the woman =verb phrase focusreading
3Only cannot associate with just any focus-bearing element; the focus-bearingelement must be in its scope syntactically. Utterances like (3) are syntacticallydistinct from utterances like (i) in the sense that only in (3) is a verb-phrase-leveladverb, while it directly modifies the subject noun phrase in (i).
(i) Only [the WOMAN] gave a banana to the monkey.The kind of ambiguity that was present in (3) with only modifying the verbphrase, disappears in (i), because the operator only takes scope over its c-commanddomain, which is the verb phrase in (3) and the subject noun phrase in (i).Therefore, (i) can only have the reading exemplified in (iia).
(ii) a. The only person that gave a banana to the monkey was the woman.b. ∗The only event that took place was the woman giving a banana to the
monkey.
Frontiers in Psychology | www.frontiersin.org 2 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 3
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
In order to be able to consider the psycholinguistic aspectsof processing only-sentences, we need to understand in a littlemore detail how the semantic meaning of such sentences isdetermined. Utterances with only can be decomposed into twoconjoined propositions (Horn, 1969; Krifka, 1992; Rooth, 1992).The first conjunct, (5b) and (6b), respectively, correspond tothe meaning of the proposition without only. This is called the‘presupposition’ or the ‘non-focal meaning component’. This partof the meaning is shared by the two utterances with indirectobject and direct object stress (5a and 6a). The second conjunctis entailed by the original only-sentence. It expresses the fact thatthe presence of only has the effect that the proposition does nothold for any other relevant alternatives. This part of the meaningis called the ‘assertion’ or the ‘focal meaning component’ (5c and6c).
(5) a. I only gave some tea to the WOMAN.b. Non-focal meaning component:
I gave some tea to the woman ANDc. Focal meaning component:
For all x [x 6= the woman], I did not give some tea to x.(6) a. I only gave some TEA to the woman.
b. Non-focal meaning component:I gave some tea to the woman AND
c. Focal meaning component:For all y [y 6= some tea], I did not give y to the woman.
As we can see from (5c) and (6c), it is the focal meaningcomponent that bears the semantic difference between the twoutterances with different prosodic accent placement. The non-focal meaning component is the same. So, it is the focal meaningcomponent that we need to target in our psycholinguisticinvestigations.
It helps to understand that the focal meaning component is infact a set of conjoined propositions. In the case of (5c), we canspell it out as in (7a), while (7b) corresponds to the focal meaningcomponent of the direct object stress utterance, (6c). The exactnumber of alternatives in each assertion set is determined by theactual context of the utterance. So, for instance, in (7a) we useda context where a man and a boy are present in addition to thewoman, while in (7b) we used a context where some coffee andbiscuits were available alongside the tea.
(7) a. {I didn’t give any tea to the man AND I didn’t give any teato the boy}
b. {I didn’t give any coffee to the woman AND I didn’t giveany biscuits to the woman}
Let us now turn to the psycholinguistic characteristics ofprocessing only-sentences. By studying the auditory processingof sentences like (3) we can investigate how fast prosodicfocal information gets integrated into the semantic parse of theutterance. In other words, as soon as we can detect evidence thatpeople can distinguish the meaning in (4a) from the meaningin (4b) in online auditory comprehension, we can concludethat they have processed the prosodic focal information andintegrated that information into the semantic parse of theutterance. Evidence of this can come from evidence of the
participants considering the focal meaning components in (5c)and (6c) or their equivalent set of propositions in (7a) and (7b).
There are two possibilities regarding the timing of thiscomputation. First, it is possible that the integration of focalprosody information is very fast and incremental. This wouldmatch the Ito and Speer (2008) findings about contrastivity.If so, one should see evidence of the non-focal meaningcomponent being considered at the earliest possible point. Giventhe semantics of only-sentences described above, the earliestpoint that the focal meaning component (i.e., 5c and 6c) canbe considered is when the proposition is complete. This is eventrue of utterances with early stress on the direct object, as in(6a). This is because even though in such utterances the prosodicfocus is available earlier, during the direct object, in order tointegrate that information into the semantic parse and computethe non-focal, and focal meaning components, it is necessaryto know the whole proposition, i.e., the verb and the indirectobject.
The second possibility is that semantic integration ofprosodic focus is considerably slower than pragmatic effectsof contrastivity. Perhaps due to the complex nature of thecalculations involved in the semantics of only-sentences (i.e.,non-focal and focal meaning components), it is possible thatevidence of the non-focal and focal meaning components beingconsidered would not emerge until well after the utteranceoffset, during wrap-up processing. Perhaps pragmatic effects ofcontrastivity would be manifest, as found by Ito and Speer (2008),at the point of the occurrence of the prosodic focus itself. Butsemantic integration of the prosodic focus information would bedelayed.
A number of reading studies have been done involving only-sentences. Paterson et al. (2007) compared reading times fordative sentences (and also double object constructions) wherethe position of the focus particle varied between a pre-directobject position (e.g., Jane passed only the salt to her mother)and a pre-indirect-object position (e.g., Jane passed the salt toonly her mother). In these constructions only associates with theimmediately adjacent noun phrase. In terms of the semanticsof only-sentences discussed above, this means that the focalmeaning components for the test sentences were Jane didn’tpass anything else to her mother and Jane didn’t pass the salt toanyone else, respectively. Accordingly, they used congruous vs.incongruous replacives as continuations to the sentences (suchas but not the pepper / but not her father) to determine whetherparticipants are sensitive to the placement of the focus particlewhen creating contrasts. This is based on the expectation thatif participants compute the relevant focal meaning componentby the time they encounter the replacives, they would findthem incongruous if they are mismatched. They found that theposition of only evoked the expected focus effect on-line (seealso Sauermann et al., 2013). This, however, manifested itself inlonger reading times for the postreplacive region, rather than thereplacive region itself. Paterson et al. suggested that ‘this delay[...] was attributable to the operation of inferential processes toevaluate the congruency of the supplied contrast’ with the focusstructure of the sentence (Paterson et al., 2007, p. 1440). Giventhis delay, it is not possible to determine whether the semantic
Frontiers in Psychology | www.frontiersin.org 3 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 4
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
integration of focus is itself late, or if it happens earlier, but ismasked by the delay caused by the inferential process involvedin determining the focus in the absence of direct prosodicinformation.
Another study that involved sentences with only is the self-paced reading experiment by Crain et al. (1994), extended bySedivy (2002) (cf. Paterson et al., 1999; Clifton et al., 2000;Liversedge et al., 2002; Filik et al., 2005). This study investigatedvariants of established garden-path sentences involving only:
(8) a. Businessmen loaned money at low interest were told torecord their expenses.
b. Only businessmen loaned money at low interest were toldto record their expenses.
The results indicated that the presence of only amelioratesthe garden path effect. This effect is consistent with a scenariowhere participants create a contrast set based on the presence ofthe operator, prompting them to build the appropriate reducedrelative clause structure already before the disambiguating mainclause verb were told appears. The amelioration of the gardenpath effect disappears again when a contrast set is given inthe discourse. We can take this as evidence that only promptsreaders to generate a contrast set (if there is none in the context)at some point before the disambiguating main clause verb.But we do not know exactly at what point it happens beforethen.
Overall, while these reading studies indicate that focusinformation is used during processing, by their nature readingstudies cannot be informative about the disambiguating role ofstress, as stress is generally not marked in writing. Furthermore,reading studies typically tap into the analysis that participantsmake by disconfirming this analysis later on in the text,measuring a resulting slowdown effect at that point; this meansthat there can always be a gap between the point in time wherethe analysis was made by the participant and when we detect itseffects.
The visual world paradigm can give precise information aboutthe interpretation of the sentence at each point in time duringthe sentence. Gennari et al.’s (2005, p. 250) measured responsetimes and overall fixation patterns in a visual-world paradigm,using a visual setup involving three people: for instance, a woman,a man and a boy. In the picture, the boy had a glass of milk infront of him, the man had a glass of milk and a cup of coffee.The woman, standing in the background, was holding a tray witha milk carton and a teapot. Participants heard utterances eitherwith neutral stress on the indirect object (like 9a) or with markedstress on the direct object (9b) in a picture verification task.
(9) a. The mother only gave some milk to the boy. Neutral stressFALSE
b. The mother only gave some MILK to the boy. Markedstress TRUE
Gennari et al. (2005) proposed that ‘marked stress is usedimmediately by the parser to decide which noun phrase bearssemantic focus and, therefore, which contrast set should beinvoked for sentence interpretation’. In other words, they
proposed that focus processing is fast and incremental in only-sentences. They reached their conclusion based on their findingthat in the Neutral Stress condition, there were fewer correctresponses than in the Marked Stress condition (MS: proportionof correct responses 0.84, SD: 0.18; NS: 0.70, SD: 0.19). Notethat this is an indirect reasoning: there could be many reasonswhy the number of correct responses was lower in the NeutralStress condition that have nothing to do with the potential earlyintegration of Marked Stress information. They did not find aresponse time difference between the two conditions (Gennariet al., 2005, p. 254). Note, however, that the expected responsesdiverged in the two conditions (MS: TRUE, NS: FALSE). It ispossible that this influenced response times because it may takelonger or shorter to verify a proposition than to falsify it. Therewas also a qualitative difference between the phonetic salience ofneutral and marked stress, which may have boosted participants’performance in the Marked Stress condition.
Gennari et al. (2005) only report overall proportion of lookson the various entities treating the entire utterance and the timebetween the utterance offset and the participants’ response as onesingle time window. They found that the ‘boy’s milk’ draws asignificantly higher proportion of looks when it bears contrastivestress (i.e., MS) compared to when it does not (i.e., NS).4
However, the different pattern of looks across the conditions canonly be interpreted as evidence for early, incremental effect offocus if they are time-locked to the appearance of the prosodicfocal information in the auditory input. To establish this, onewould need to know not only the overall fixation patterns,as provided by Gennari et al. (2005), but also how the eyemovements progress as the sentence unfolds. To sum up, Gennariet al. (2005) found that the number of correct responses washigher in the Marked Stress condition, but no difference inresponse times. They also found that entities bearing contrastivefocus are targeted more by eye gaze if overall looks are considered,raising the possibility that a pragmatic effect of contrast occursearly. They did not investigate the time course of semanticintegration of prosodic focal information.
To sum up, we have seen that prosodic focus, as aninformation structuring device, often has pragmatic effects, i.e., itmakes certain utterances felicitous or infelicitous in context. Onesuch effect is contrastivity, which was investigated by Eberhardet al. (1995) and Ito and Speer (2008, 2011). What they found
4They further report that significantly more looks targetted what they labelled‘contrast’ entities, namely ‘the man’s coffee (as well as on the set of contrastingelements such as the teapot taken as a whole)’ (Gennari et al., 2005, p. 256), inMarked Stress than in Neutral Stress. Unfortunately, it is unclear whether thecombined looks to ‘contrast entities’ includes looks to ‘the man’s milk’, which isa crucial object for determining the truth value of the sentence The mother onlygave some milk to the boy. Whether or not the ‘man’s milk’ is included in whatGennari et al. (2005) called ‘contrast entities’, it is difficult to interpret the increasedlooks to these entities in the Marked Stress condition. It is unexpected in the lightof the semantics outlined above (see examples 5–7), since neither the man’s coffeenor the woman’s teapot plays any role in either the non-focal or the focal meaningcomponent in the Marked Stress condition. For establishing the truth value of thesentence The mother only gave some MILK to the boy, only the boy’s possessions arerelevant. At the same time, it is also hard to interpret this increase as manifestationof the pragmatic effect of contrastivity demonstrated by Ito and Speer (2008, 2011).Ito and Speer (2008, 2011) found that looks increase to the target entities whencontrastive accent is used appropriately, not to the entities contrasting with thetarget entity.
Frontiers in Psychology | www.frontiersin.org 4 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 5
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
was that contrastive prosody facilitates reference resolution andthat the contrastive information is interpreted with respect tothe actual context. But focus can also directly contribute to thesemantics of the utterance if an operator such as ‘only’ is presentin the utterance. In such sentences, one simply cannot computethe full meaning of the utterance without knowing where theprosodic focus is. So in such sentences, focus has a semanticeffect, not just a pragmatic one. Semantic integration of focuswas investigated by Paterson et al. (2007), but in a reading study,so the position of the focal prosody is only inferred, which mayhave contributed to the observed delay in the integration of theprosodic focus information into the semantic parse.
Our StudyWe investigated utterances with alleen ‘only’ with different focalaccent placements in a visual world paradigm, such as the DutchIk heb alleen SELDERIJ aan de brandweerman gegeven ‘I onlygave CELERY to the fireman’ versus Ik heb alleen selderij aan deBRANDWEERMAN gegeven ‘I only gave celery to the FIREMAN’.Our objective was to detect the earliest point that participants’ eyegaze patterns give evidence that they consider the propositionsthat make up the focal meaning component of the utteranceswith different focal accents. Our study thus reveals how fastdifferent focal accent placements on the arguments of the verbget integrated into the semantic parse of the utterance duringauditory comprehension. We carried out three visual-worldparadigm experiments to investigate these issues, measuringresponse times and the time course of eye movements. InExperiment 1 the divergent expected responses (Early Stress:YES; Late Stress: NO) corresponded to the Gennari et al. (2005)study to maximize chances of comparison. In Experiment 2, thevisual stimulus was adapted in such a way that the expectedresponse was YES in both conditions. In Experiment 3, thevisual stimuli were changed to trigger NO responses in bothconditions.
Our hypothesis was that if participants integrate prosodic focalinformation immediately, their looks will reflect the semanticparsing of the utterance during utterance comprehension. Thealternative position is that only the pragmatic effect of contrastis fast, while semantic integration of prosodic focus into theparse only appears later, during wrap-up computation. In orderto determine the expected looks during sentence verification, letus apply Rooth’s (1992) semantics to the specific example fromour experiments to the visual scene seen in Experiment 1, seeFigure 1.
(10) Early Stress (ES) condition:
a. Example in English: I only gave CELERY to the firemanb. Non-focal meaning: I gave celery to the firemanc. Focal meaning: I did not give anything else to the
fireman = {I didn’t give x to the fireman AND I didn’tgive y to the fireman AND I didn’t give z to the fireman. . . , where x, y, z, . . . are objects that could have beengiven to the fireman in the context}
d. Potentially relevant entities for verification of focalmeaning in visual context: fireman and his objects
FIGURE 1 | Example of visual stimulus for Experiment 1.
(11) Late Stress (LS) condition:
a. Example in English: I only gave celery to the FIREMANb. Non-focal meaning: I gave celery to the firemanc. Focal meaning: I did not give celery to anyone else = {I
didn’t give celery to x AND I didn’t give celery to y ANDI didn’t give celery to z . . . , where x, y, z, . . . are peoplethat a celery could have been given to in the context}
d. Potentially relevant entities for verification of focalmeaning component in visual context: any other personin the picture and their objects
e. Falsifying proposition in Experiment 1: I gave celery tothe diver.
f. Entities relevant for the falsifying proposition inExperiment 1: diver, diver’s celery
Concerning the time course of the verification procedure thefollowing predictions hold. The earliest point at which the focalor non-focal meaning components can be verified is when theproposition is complete. Given that the verb is predictable in ourexperiment, we may actually find verification of the focal (andnon-focal) meaning component to start at the indirect objectdirectly preceding it. In the ES condition, marked stress and thusfocus can be identified earlier: at the point when the direct objectis heard. But note that the focal meaning components cannot yetbe computed at this point. When the participant hears I only gaveCELERY to the. . . the sentence could end in two different ways.Either the indirect object turns out to be the fireman, as in ouractual example, in which case the utterance matches the picture,or the indirect object could turn out to be the diver, in whichcase the utterance would not match the picture. (In principle, the
Frontiers in Psychology | www.frontiersin.org 5 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 6
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
utterance could also end with the indirect object the doctor, butthis would constitute an infelicitous utterance. Since the doctorhas no celery in the picture, the non-focal meaning componentwould not be true.) For this reason, we do not expect a differencein response times across the conditions. In terms of expectednumber of correct responses, we do not expect a differencebetween the conditions either. Both the ES and LS conditionscontain prosodically marked, contrastive stress associated withonly so there is no reason to expect that either condition wouldbe easier to perform than the other.
Given the basic linking hypothesis that auditory inputguides visual attention, one would expect looks to targetthe entities mentioned in the utterances, i.e. the firemanand any celery when these words are heard. Looks to theseentities could also reflect verification of the non-focal meaningcomponent, which directly corresponds to the utterance withoutonly.5
The main goal of this paper is to investigate the effectsof prosodic focal accent on the direct object versus theindirect object in only-sentences. For this reason, we will beprimarily interested in the looks that can be attributed tothe different focal meanings. The looks required to verify thefocal meaning components differ across the two conditions.Let us take the ES condition first. In order to verify I didn’tgive anything else to the fireman, looks need not shift awayfrom the fireman and his plates. Participants simply need tocheck that the fireman does not have other objects beside hiscelery.
In contrast, in the LS condition, the focal meaning componentis I didn’t give celery to anyone else. Verification of thisproposition would require checking that the propositions makingup the focal meaning component, i.e., (11c), are all true. Thiswould mean looking at any people in the picture and their objectsbecause these are potentially relevant entities for determiningthat the fireman is the only person who received a celery (see11d). Since the utterance with late stress is actually false inExperiment 1, one of these propositions is false. The actualfalsifying proposition in the visual context is ‘I gave celery tothe diver’ (see 11e). In other words, it is due to the fact thatthe diver has a celery in the picture, that the utterance doesnot match the picture. For this reason, we expect that lookswill target the diver and the diver’s celery. Without lookingat these entities, it is simply impossible to reach the correctresponse. In addition, looks targeting the diver’s corn duringthe computation of the response in the LS condition are alsoconsistent with our predictions but are not necessary to reach acorrect response. While the diver’s corn is irrelevant for the focalmeaning component, participants may look at it to verify that itis not celery. In addition, participants might target the doctor, tosee if he has any celery. Since in our specific example the doctor’splates are empty and the emptiness of a plate is easy to identify
5We remain agnostic as to whether participants would actually verify the truthof the non-focal meaning component. Kim (2008) found that presuppositionalmeaning is sometimes verified, and sometimes not depending on the task and thesaliency of the entities and the grammatical encoding of the elements. Note that ifthe focal meaning is verified, the verification can only take place after the indirectobject has been heard.
parafoveally, it is expected that the doctor’s empty plates are notdirectly targeted by looks.
To sum up, our hypothesis is that once participants proceedto verify the focal meaning component, we expect that lookswill diverge across the two conditions. In particular, we predictthat in the ES condition looks will stay on the fireman and thefireman’s celery, while in the LS condition, looks will shift to thediver, the diver’s celery and to a lesser extent to the diver’s corn(and perhaps even to the doctor). This could take place at theearliest during the indirect object or during the sentence finalverb gegeven if semantic integration of prosodic focus is fast, andmay occur after the utterance offset if semantic integration ofprosodic focus is slower. If looks diverge in the predicted wayalready at the point of the indirect object, we would take that tobe evidence for fast, incremental semantic integration of prosodicfocus information.
There is one additional specific conclusion that we can draw,if our predictions are born out, irrespective of the fast or slownature of the semantic integration. The proposed findings wouldconstitute evidence that the verification process corresponds tothe semantics associated with the utterance (see Horn, 1969;Krifka, 1992; Rooth, 1992 and discussion above). In principle,one may imagine that instead of looks corresponding directlyto the focal meaning component participants could engage inheuristic strategies. For argument’s sake, one may hypothesizefor instance that in order to verify an utterance involving only itwould be enough to look for the falsifying entity. So, in I onlygave celery to the FIREMAN looks could target any celery inthe picture that does not belong to the fireman. The participantcould legitimately reject the utterance without having verifiedthat this offending celery in fact belongs to the diver. In otherwords, looks to the falsifying entity are logically necessary forfalsification, but looks to the possessor of that falsifying entity arenot. If we find looks targeting the diver too, that would supportthe hypothesis that sentence verification follows the proposition-based semantics associated with only-sentences. In other words,we take looks to the diver (in addition to looks to the diver’scelery) as an indication that participants do not simply look for anoffending object, but attempt to verify the relevant proposition ofthe focal meaning component, I didn’t give celery to anyone else,falsified by the proposition I gave celery to the diver. Naturally,this evidence is only indirect. We cannot be sure that looks to thediver necessarily mean that participants entertain the falsifyingproposition. But the implication holds the other way: anyoneentertaining the falsifying proposition would have to look at thediver as well as his celery.
EXPERIMENT 1
Materials and MethodsParticipantsTwenty adult participants were recruited from the UiL OTSparticipant pool, which is largely made up of undergraduatestudents from Utrecht University. All participants were non-dyslectic native speakers of Dutch. Participants were unawareof the purpose of the experiment, and were paid 5€ for their
Frontiers in Psychology | www.frontiersin.org 6 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 7
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
participation. The mean age of the participants was 22;8 years(range: 19–29); 18 participants were females; 17 were right-handed. This study was carried out in accordance with researchethical laws of the Netherlands with informed consent from allsubjects.
MaterialsSixteen items were constructed in two conditions. Figure 1 aboveshows an example scene for both conditions. There are threepersons in the picture; a diver, a doctor and a fireman in thisexample. The persons all hold large plates on their left and rightside. Some of the plates are empty and some contain food or drinkitems, a celery or a corn cob in this example. In particular, thediver has a plate with a celery and one with a corn cob, while thefireman has a plate with a celery and an empty plate.
As shown by the example test items in (12) the expectedresponse in the ES condition was YES, while it was NO in theLS condition. This allowed us to determine whether participantsreached the correct response depending on the prosody of theitem. This also allowed us to have results that are comparableto Gennari et al.’s (2005) study, although this difference doesintroduce a potential confound for response time measurements.
The visual scenes were designed using a 3 × 3 grid design,the distance between any two objects is identical and would besufficiently large to be well-suited for eye-tracking evaluation.The pairs of objects in the pictures – corn and celery in Figure 1 –were chosen to match in size, shape, and gray value, to makesure participants shift their gaze to them and not identify themparafoveally while looking at the person in the middle.
The audio stimuli corresponding to the visual scene inFigure 1 are in (12). The full items list is in Appendix 3.
(12) a. ES condition: Expected answer: YESIk heb alleen SELDERIJ aan de brandweerman gegevenI have only celery to the fireman given‘I only gave celery to the fireman.’
b. LS condition: Expected answer: NOIk heb alleen selderij aan de BRANDWEERMAN gegevenI have only celery to the fireman given‘I only gave celery to the fireman.’
For practical reasons, the experiment was performed in Dutch.Dutch prosody is sufficiently similar to English prosody to allowcomparison with previous work in English. Both languages markcontrastive focal accent by enhanced duration and H∗L pitchaccent. Stress placement within an utterance is free to matchthe focus within the syntactic scope of the semantic operator.Relevant phonetic details for the examples in (12) are in TableS1 in Appendix 1.
Verbal stimuli were pre-recorded by a female native speakerof Dutch. They were checked for the placement of pitch accentsusing PRAAT (Boersma and Weenink, 2006); for examples seeFigures 2 and 3.
The names of target objects and people that were usedin the sentences were matched in length. They were all atleast three syllables long. There were no significant differencesbetween conditions in the overall lengths of the audio stimuli[t(16)= 0.95, p= 0.925].
FIGURE 2 | Pitch track for example of Late Stress (LS) Conditionutterance.
FIGURE 3 | Pitch track for example of Early Stress (ES) Conditionutterance.
Although the utterances as a whole do not differ in lengthin the two conditions, there are slight differences in lengthbetween conditions in certain auditory segments: the directobject segment was slightly longer when it was stressed (in theES condition), as was the indirect object segment in the LScondition. These differences canceled each other out overall,since each condition contains exactly one segment with markedstress.
Ninety-eight filler items were constructed including variousquantifiers (e.g., niet iedereen ‘not everybody’). The fillers werebalanced for YES/NO expected responses. To match our testitems, the fillers either involved early marked stress on the directobject or late marked stress on the indirect object. The fillersincluded a set of 32 control items involving alleen, 16 with earlyand 16 with late stress, where the expected response was differentfrom the expected response of the corresponding test condition.This would discourage people from developing a strategy relatingthe position of the accent to the expected response in thetest items (i.e., early stress = YES; late stress = NO). Finally,half of these control items referred to the ‘doctor’ (i.e., tothe person in the middle in the visual stimulus), so thatparticipants do not disregard the middle person in the picturein general. A list of the type of fillers used is in Table S2 inAppendix 1.
Furthermore, we controlled for potential confounds causedby the spatial location of objects in the picture by varyingtheir positions on the top-down and the left-right axes. The
Frontiers in Psychology | www.frontiersin.org 7 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 8
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
falsifying entity for the LS condition (the diver’s celery in theexample in Figure 1) appeared in four different positions: infour items it was located in the top left-hand corner of thepicture (as in Figure 1); four times it appeared in the top right;four times in the bottom right; and four times in the bottomleft.
ProcedureThe participants were tested individually in a sound-treatedbooth. Prior to the experiment, they read an instruction sheet,which included the setting of the experiment (see Appendix2 for a translation). This provided a context in which bothutterances with early stress and utterances with late stress wouldbe pragmatically felicitous. The participants’ task was to indicatewhether the sentence matched the visual scene by pressing abutton on a button box, using the dominant hand to give a YESresponse, and the non-dominant hand to give a NO response.This might have introduced a response time bias in favor of theES condition.
The experiment was programmed in FEP (Veenker, 2005). Eyemovements of the participants’ right eye were recorded with anEyeLink 1000 eye tracker in remote mode using a target stickerto track head movements, at a 500 Hz sampling rate. Participantswere seated at a distance of 600–650 mm from the screen wherethe visual image was presented; the height of the participants’chair was adjusted to get an optimal image of the eye.
After the experimenter had ensured a clear image of thepupil, corneal reflection, and target sticker, the experimenter leftthe participant booth and a 13-point calibration and validationprocedure was initiated from the control room. These wererepeated until the experimenter was satisfied that they weresuccessful. Every stimulus was preceded by a fixation target inthe middle of a blank screen. An automatic drift check wasapplied as the participant fixated this fixation target and arecalibration initiated if the drift check indicated a drift of morethan 20 pixels. Participants were allowed 1000 ms to explorethe visual scene before the utterance was presented. The wholeprocedure, including instruction and calibration, took about20 minutes for each participant.
After successful calibration, the participants were exposedto a practice block of 12 practice items (fillers, 2 of thoseresembling experimental items), to familiarize them with the task.The practice block was followed by a small pause in which theparticipants could ask questions about the task (if necessary).After this, the experiment would start. The remaining 118 trials(32 test items, 32 controls, 54 fillers) were presented in two blocks;each block was preceded by a calibration.
All the names of the persons and objects involved in theexperimental items were mentioned in the first 16 filler trials(including the practice block), to ensure that the participants hadseen them and knew what they were called.
All participants saw all the test items in both conditions.The items and fillers were presented to the participants in apseudo-randomized order where an experimental item neverdirectly followed another experimental item in any condition;of the (experimental or filler) items involving alleen ‘only’,the trials with late stress never followed a trial with early
stress or vice versa; and experimental items never followeda filler involving alleen ‘only’ with any stress pattern. Nomore than three trials with the same stress pattern occurredsuccessively.
ResultsFor the response data, two experimental trials belonging to thesame participant were removed because the answer had alreadybeen given before onset of the indirect object. In addition, onefiller trial was removed because the answer had been given beforesentence onset.
Number of Correct ResponsesThe percentage of correct responses for the LS condition was98%, for the ES condition 99%. The difference was not significant[F1(1,19) = 2.923, p = 0.104, η2
p = 0.133). The overall correctresponse rate for the experiment was 98%.
Response TimeThe overall mean response time from utterance onset for the LScondition was 3034 ms, while it was 3048 ms for the ES condition.The difference is not significant [F1(1,19) = 0.027, p = 0.871,η2
p = 0.001].So, we did not find a significant effect in response time or
accuracy across the conditions.
Eye Gaze PatternsCoding and analysisWe identified six Areas of Interest, the ‘fireman’, the ‘fireman’scelery’, the ‘diver’, the ‘diver’s celery’, the ‘diver’s corn,’ and the‘doctor’. See Figure 1. Fixations were assigned to the AoI theyoccurred on. For ease of reference, we refer to the AoIs with thecontent of the example stimulus in Figure 1; the data and plotsthat we give are calculated over all items and participants, so whenfor instance we say ‘fireman’, we mean ‘the person in the picturewho is part of the non-focal proposition in all the items’.
For the fine-grained analysis of eye movements over time,we divided the utterances into relevant audio segments,and determined the onset of each segment using PRAAT.The sentence segments are: selderij/SELDERIJ ‘celery/CELERY’;aan de ‘to the’; BRANDWEERMAN/brandweerman ‘FIREMAN/fireman’; gegeven ‘given’. For each segment, we analyzed thefixation samples falling between 200 ms after segment onset,and 200 ms after the offset of that auditory segment, to takeinto account that it takes 200 ms to launch a saccade drivenby linguistic input (cf. Altmann and Kamide, 2004). The finalthree segments comprise the interval starting at 200 ms afterthe onset of the verb gegeven ‘given’ and ending 1500 ms later.This was divided into three segments of identical length. Forease of reference we call these the auditory segment gegeven‘given’, the ‘first 500 ms interval after offset’ and the ‘second500 ms interval after offset’. For reference, the average durationsof the utterance components are given in Table S3 in Appendix1.
For each experimental trial, we computed the proportionof time the participant spent fixating each area of interest ineach auditory segment. We averaged these proportions over
Frontiers in Psychology | www.frontiersin.org 8 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 9
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
TABLE 1 | Analysis of variance per auditory segment for Experiment 1.
Auditory segment Factor df1 df2 F1 p η2p
Direct object AoI 3.689a 70.087a 4.814 0.001∗ 0.202
Condition 1 19 4.682 0.043∗ 0.198
AoI∗condition 5 95 2.113 0.071† 0.100
aan de ‘to the’ AoI 3.415a 64.885a 24.715 0.000∗ 0.565
Condition 1 19 0.223 0.642 0.012
AoI∗condition 3.729a 70.849a 2.747 0.023∗ 0.126
Indirect object AoI 4.148a 78.805a 104.798 0.000∗ 0.847
Condition 1 19 5.586 0.029∗ 0.227
AoI∗condition 4.032a 76.607a 8.793 0.000∗ 0.316
gegeven ‘given’ AoI 3.015a 57.278a 65.559 0.000∗ 0.775
Condition 1 19 0.950 0.342 0.048
AoI∗condition 5 95 28.532 0.000∗ 0.600
0–500 ms after offset AoI 3.064a 58.217a 30.372 0.000∗ 0.615
Condition 1 19 1.438 0.245 0.070
AoI∗condition 2.210a 41.998a 19.112 0.000∗ 0.501
501–1000 ms after offset AoI 2.412a 45.828a 16.588 0.000∗ 0.466
Condition 1 19 0.395 0.537 0.020
AoI∗condition 2.884a 54.803a 3.907 0.003∗ 0.171
aHuynh-Feldt corrected.
TABLE 2 | Proportions of time spent fixating each of the relevant people and objects in Early Stress (ES) and Late Stress (LS) condition for each auditorysegment, and pairwise comparisons between conditions with Bonferroni correction applied for Experiment 1.
Auditory segment Person/object Proportion of time looking in ES Proportion of time looking in LS F1(1,19) p η2p
Indirect object Diver 0.10 0.14 10.557 0.004∗ 0.357
Diver’s celery 0.04 0.10 15.540 0.001∗ 0.450
Fireman 0.42 0.33 14.756 0.001∗ 0.437
gegeven ‘given’ Diver 0.05 0.14 29.426 0.000∗ 0.608
Diver’s celery 0.04 0.12 13.907 0.001∗ 0.423
Fireman 0.42 0.24 83.212 0.000∗ 0.814
Fireman’s celery 0.19 0.12 9.824 0.005∗ 0.341
0–500 ms after offset Diver 0.04 0.10 14.356 0.001∗ 0.430
Diver’s celery 0.02 0.08 25.298 0.000∗ 0.571
Diver’s corn 0.01 0.04 8.803 0.008† 0.317
Fireman 0.29 0.13 20.688 0.000∗ 0.521
Fireman’s celery 0.13 0.08 6.946 0.016† 0.268
participants for the ES and LS conditions. We carried out six two-way repeated measures ANOVAs (using SPSS 22), one for eachauditory segment, using Visual AoI and Condition as the twomain factors. The results are reported in Table 1. We also provideeffect size measures (η2
p).We found a significant main effect for AoI in all auditory
segments, meaning that participants look more to some AoIsthan to others in all auditory segments. We also found asignificant main effect for condition in the direct object andthe indirect object auditory segments, which means that onecondition has a more unequal distribution of the proportionof looks than the other. Since they are not relevant toour research question, we do not investigate these maineffects further. Although these may well contain interestinginformation about how our test sentences are processed, in
this paper, we concentrate on the differences in eye gazepatterns that can be attributed to the prosodic differenceacross the conditions, which is manifested in the interactionsbetween Visual AoI and Condition. This is consistent withour intention to investigate whether (and if so when) thereis any effect of the position of prosodic focus (i.e., earlyversus late) on the eye gaze patterns associated with auditorysentence processing. Thus, the most relevant findings are thatwe found significant interactions of Visual AoI and Conditionfor the aan de ‘to the’, the indirect object, the verb (gegeven‘given’) time segments, and for the first and second 500 msinterval after utterance offset. These are indicated in bold inTable 1 for ease of reference. For these auditory segments,we carried out pairwise comparisons, applying Bonferronicorrections, to reveal which AoIs were targetted by eye fixations
Frontiers in Psychology | www.frontiersin.org 9 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 10
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
FIGURE 4 | Time course of the mean fixation time proportions for each AoI in Experiment 1.
differently, at any segment, across the two conditions. Thesignificant results are reported in Table 2. Statistically significantdifferences are indicated with an asterisk, and marginallysignificant effects are marked with a dagger throughout thepaper.
Figure 4 gives the time course of the mean fixation timeproportions for each AoI in each auditory segment in Experiment1. The figure contains 8 plots corresponding to the nine cells inour 3 × 3 visual scene (the ‘doctor’s empty plates’ are plottedtogether). In each plot we give the mean proportion of timespent looking at that cell (e.g., the ‘fireman’ etc.) during eachauditory segment in the two conditions (ES vs. LS). The error barsindicate±2 SE.
DiscussionBehavioural MeasuresWe found a high number of correct responses in both conditionsand no significant difference in the number of correct judgmentsbetween the conditions. We speculate that the reason why wegot a higher number of correct responses than Gennari et al.(2005) may have been because their experiment involved a morerealistic and more complex visual scene, while ours was a stylised3 × 3 design, and because we provided an overall context story,while participants in the Gennari et al. (2005) study heard itemswithout context.
Like Gennari et al. (2005), we did not find any significantdifferences in response times between the conditions.
Frontiers in Psychology | www.frontiersin.org 10 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 11
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
Importantly, this result may have been influenced by thedifference in expected responses (ES: YES, LS: NO), as in theoriginal Gennari et al. (2005) study. It is possible that it takeslonger (or shorter) to find a falsifying entity in a picture thanit takes to scan the picture and verify that there is no falsifyingentity present. It could also be the case that a negative judgmenttakes longer (or shorter) than a positive one. Moreover, it ispossible that the use of the non-dominant hand to tap a NOresponse introduced a bias in favor of the ES condition. Inorder to control for these factors, we performed Experiment2, where the expected response in both conditions was YES.In Experiment 3, the expected response was NO in bothconditions.
Eye Gaze PatternsThe most important finding from Experiment 1 is that looksstarted diverging across conditions in the predicted way duringthe indirect object time segment: More looks targeted the ‘diver’and the ‘diver’s celery’ in the LS condition than in the EScondition and more looks targeted the ‘fireman’ in the EScondition than in the LS condition. This provides evidence thatparticipants have computed the focal meaning component asearly as the indirect object time segment: so focus informationwas integrated into the semantic parse of the utterance very fast.This is because the observed divergent looks correspond to theparticipants’ attempt at verifying the focal meaning componentof the utterances, which is different in the two conditions [see(10c) and (11c) above].
The looks follow the predictions further at the sentence-finalverb gegeven ‘given’ and after the utterance offset: participants’looks target the ‘fireman’ and the ‘fireman’s celery’ more in theES condition, while looks target the ‘diver’, the ‘diver’s celery’, andsomewhat later also the ‘diver’s corn’ in the LS condition (theeffect on the ‘diver’s corn’ is right at the Bonferroni-correctedalpha level, to be on the conservative side we take it to bemarginally significant). As expected, the effect was longer on the‘diver’s celery’ and the ‘diver’ as these correspond to the actualfalsifying proposition in the LS condition, and it was shorter andless pronounced on the ‘diver’s corn’ which is an entity that is onlypotentially relevant for falsification.
These findings also constitute evidence that the verificationprocess corresponds to the semantics associated with theutterance (see Horn, 1969; Krifka, 1992; Rooth, 1992 anddiscussion above). Participants do not simply look for anoffending object (i.e., ‘diver’s celery’), their looks also target theperson holding that object (i.e., ‘diver’). We take this to mean thatthey are verifying the relevant proposition of the focal meaningcomponent, I didn’t give celery to anyone else, falsified by theproposition I gave celery to the diver.
Our expectation that looks follow the logic of the semanticallydetermined focal meaning component is further supported by therelative absence of looks to irrelevant entities. While participantsdo look at the (potentially) falsifying entities (the ‘fireman’s celery’in the ES condition; the ‘diver’s celery’ in the LS condition),they do not target Gennari et al.’s (2005) ‘contrast entities’, i.e.,the ‘diver’s corn’ in the ES condition. At no auditory segment
are looking proportions to the ‘diver’s corn’ higher in the EScondition than in the LS condition.
Overall, our findings are consistent with early and incrementalfocus identification and association with only, consolidatingearlier results by Gennari et al. (2005), Paterson et al. (2007),and Ito and Speer (2008), and pinpointing the effect to theearliest point in time that participants have the necessaryinformation to compute the meaning components of theutterance, namely the indirect object. No facilitation wasfound for response times. However, this may have been dueto the fact that the two conditions had divergent expectedresponses in Experiment 1. We investigate this issue further inExperiment 2.
As a final point, recall that our research aim is to determinethe earliest point at which participants’ looks give evidence ofdistinguishing the two conditions. In general, in visual-worldeye-tracking, we can observe that fixation proportions on aparticular AoI always grow gradually, eventually reaching apeak, and then gradually diminishing. As a result, any robustdifference we find in a particular sound segment is likely to beimmediately preceded (and followed) by a less robust differencein the preceding (or following) sound segment. As noted above,we found a predicted difference between time spent fixatingon the ‘fireman’ in the ES and the LS conditions, which issignificant (with Bonferroni correction) from the indirect objectsegment onwards. But as Figure 4 shows, the difference inlooks to the ‘fireman’ across the conditions seems to startto grow already during the aan de ‘to the’ segment. At thissegment, the effect is not robust under correction for multiplecomparisons [F1(1,19) = 6.006, p = 0.024, η2
p = 0.240], so
FIGURE 5 | Example of visual stimulus for Experiment 2.
Frontiers in Psychology | www.frontiersin.org 11 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 12
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
we cannot draw any strong conclusions from it, but it is anindication that the effect may start to happen earlier than weexpected. This is surprising because it suggests that participantsstart looking at the ‘fireman’ before they have actually heardthe noun brandweerman ‘fireman’. We interpret the potentialearly start of the effect as the participants’ anticipating thecontinuation of the utterance to be brandweerman ‘fireman’ atthe point when they have heard Ik heb alleen SELDERIJ aan de. . .‘I only (gave) CELERY to the. . .’. We tested this hypothesis inExperiment 3.
EXPERIMENT 2
Material and MethodsParticipantsTwenty non-dyslexic native Dutch speakers were recruited fromthe UiL OTS participant pool. Participants were unaware ofthe purpose of the experiment, and were paid 5€ for theirparticipation. The mean age of the participants was 24;3 years(range: 19–46); 19 females and 1 male; 17 participants wereright-handed.
FIGURE 6 | Time course of the mean fixation time proportions for each AoI in Experiment 2.
Frontiers in Psychology | www.frontiersin.org 12 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 13
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
TABLE 3 | Analysis of variance per auditory segment for Experiment 2.
Auditory segment Factor df1 df2 F1 p η2p
Direct object AoI 2.351a 44.671a 32.373 0.000∗ 0.630
Condition 1 19 0.914 0.351 0.046
AoI∗condition 2.869a 54.514a 0.952 0.419 0.048
aan de ‘to the’ AoI 1.874a 35.606a 108.419 0.000∗ 0.851
Condition 1 19 0.215 0.648 0.011
AoI∗condition 2.571a 48.857a 0.551 0.623 0.028
Indirect object AoI 2.468a 46.885a 72.450 0.000∗ 0.792
Condition 1 19 1.061 0.316 0.053
AoI∗condition 3.339a 63.437a 5.486 0.001∗ 0.224
gegeven ‘given’ AoI 2.089a 39.684a 18.029 0.000∗ 0.487
Condition 1 19 0.213 0.649 0.011
AoI∗condition 4 76 9.312 0.000∗ 0.329
0–500 ms after offset AoI 3.050a 57.953a 8.651 0.000∗ 0.313
Condition 1 19 0.014 0.907 0.001
AoI∗condition 3.647a 69.295a 3.946 0.008∗ 0.172
501–1000 ms after offset AoI 2.740a 52.055a 2.690 0.061 0.124
Condition 1 19 0.572 0.459 0.029
AoI∗condition 2.932a 55.700a 1.064 0.371 0.053
aHuynh-Feldt corrected.
TABLE 4 | Proportions of time spent fixating each of the relevant people and objects in ES and LS condition for each auditory segment, and pairwisecomparisons between conditions with Bonferroni corrections applied for Experiment 2.
Auditory segment Person/object Proportion of time looking in ES Proportion of time looking in LS F1(1,19) p η2p
Indirect object Diver’s corn 0.05 0.09 6.777 0.017† 0.263
Fireman 0.44 0.37 8.070 0.010† 0.298
gegeven ‘given’ Diver’s corn 0.05 0.13 17.702 0.000∗ 0.482
Fireman’s celery 0.20 0.12 22.323 0.000∗ 0.540
0–500 ms after offset Diver’s corn 0.04 0.08 9.169 0.007∗ 0.326
MaterialsLike in Experiment 1, 16 items were constructed in twoconditions. The auditory stimuli were identical to the ones usedin Experiment 1. The visual stimuli of Experiment 1 were changedin such a way that the expected responses were YES in bothconditions. In particular, the ‘diver’ had a plate with a ‘corn cob’and an empty plate, while the ‘fireman’ had a plate with a ‘celery’and an empty plate. See Figure 5 for an example.
The 64 unrelated fillers from Experiment 1 were includedalongside the test items. In addition 32 controls involving alleen‘only’ were created, 16 with early stress, 16 with late stress,with half of the items referring to the ‘doctor’. The expectedresponse for the controls was NO, to counterbalance the YES biasintroduced by the test items.
ProcedureThe procedure was identical to that of Experiment 1.
PredictionsWe were interested to see if there was a facilitatory effect of earlystress resulting in shorter response times in the ES conditioncompared to the LS condition. Regarding eye movement patterns,our predictions were the same as in Experiment 1 except that
since both conditions are true in the pictures, there is no falsifyingentity in the picture.
ResultsTwo trials from different experimental participants were removedfrom the response data because the response was given before theindirect object segment.
Number of Correct ResponsesThe percentage of correct responses for the LS conditionwas 100%, for the ES condition 99%. The difference was notsignificant [F1(1,19) = 0.322, p = 0.577, η2
p = 0.017]. Theoverall correct response rate for the experiment as a wholewas 98%.
Response TimeThe overall mean response time from utterance onset for the LScondition was 2843ms, while it was 2868 ms for the ES condition.The difference is not significant [F1(1,19) = 0.147, p = 0.706,η2
p = 0.008).
Eye Gaze PatternsCoding and analysis was identical to that in Experiment 1.
Frontiers in Psychology | www.frontiersin.org 13 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 14
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
Figure 6 gives the mean proportion of looking time for eachAoI in the auditory segments.
Table 3 gives the analysis of variance for Experiment 2.Like in Experiment 1, we were interested in the interaction
between Visual AoI and Condition, because this would revealany potential effect of the different positioning of focal accent onsentence processing. The interaction between AoI and Conditionwas significant during the indirect object, the verb gegeven ‘given’,and during the first 500 ms after the verb. These are indicatedwith bold in Table 3 for ease of reference. To reveal wherethe actual differences in eye gaze patterns across the conditionslied, we performed pairwise comparisons for these auditorysegments. The significant Bonferroni-corrected results are givenin Table 4.
Looks started diverging across conditions during the indirectobject auditory segment, where more time was spent looking atthe ‘diver’s corn’ in the LS condition, and more time was spentlooking at the ‘fireman’ in the ES condition. On the ‘diver’s corn’the effect continued during the auditory segments correspondingto the verb, and the first 500 ms intervals after utterance offset.During the utterance final verb gegeven ‘given’, there was alsosignificantly more time spent targeting the ‘fireman’s celery’ inthe ES condition.
DiscussionBehavioral MeasuresWe found a high rate of correct responses for both testconditions. We did not find that the early occurrence of stressfacilitated verification, as response times did not differ acrossconditions. See Section “General Discussion” below for more onthis.
Eye Gaze PatternsThe eye gaze patterns were similar to Experiment 1. The lookingpatterns diverged in the expected way: more looks on the‘fireman’ and the ‘fireman’s celery’ in the ES condition and on the‘diver’s corn’ in the LS condition. Perhaps due to the more simplenature of the visual stimulus, the effects are not as sustained overtime as they are in Experiment 1.
Let us now turn to our final experiment, where the expectedresponses were NO in both conditions.
EXPERIMENT 3
IntroductionRecall that in Experiment 1, we found that the difference betweenconditions in the looks targeting the ‘fireman’ seems to startalready before the indirect object was heard. Although we didexpect more looks targeting the ‘fireman’ in the ES condition, wedid not expect this to happen until after the indirect object debrandweerman ‘the fireman’ was actually heard. We believe thatthis increase in looks targeting the ‘fireman’ in the ES condition isanticipatory. Our hypothesis is that in a picture verification task,participants employ an unconscious strategy when performingthe task: they start out with the assumption that the utterance willmatch the picture.
Let us explain this in more detail using our actual example.Take the moment when participants hear the first half of theutterance Ik heb alleen SELDERIJ . . . ‘I have only CELERY . . .’in a setting where the fireman is the only person that has onlycelery, as in Figure 1 from Experiment 1. At this point, thereis only one continuation of this utterance that would make thesentence true in the picture, namely the actual continuation (i.e.,. . . aan de brandweerman gegeven. ‘. . .to the fireman given.’).Any alternative continuation (e.g., referring to the ‘diver’) wouldmake the sentence false. Given that there is only one way asentence can be true in a picture and there are many ways itcould be false, it would make sense for the listener to adopt acognitive strategy that assumes that the sentence is true untilproven wrong. In contrast, in the LS condition, when participantshear Ik heb alleen selderij . . . ‘I have only celery . . .’ in a contextof a picture where both the fireman and the diver has celery, as inFigure 1, there is no continuation of the utterance that can makethe sentence true. So, there is no anticipatory advantage in thiscondition. This has the effect that there is a stronger tendencyfor participants’ eye gaze to already target the ‘fireman’ in the EScondition than in the LS condition, even before they have heardthe word brandweerman ‘fireman’ (i.e., during the aan de ‘to the’auditory segment).6 In short, we speculate that in a bi-modalverification task, participants anticipate the continuation of thesentence to be such that it makes the utterance true in the picture.
This is in line with findings of Altmann and Kamide (1999)and Kamide et al. (2005) (see also Ito and Speer, 2008). Theseauthors found that while listening to utterances presented tothem in the visual-world paradigm, participants can sometimesanticipate certain semantic properties of forthcoming lexicalitems based on the lexical items they have already heard. Inparticular, they tested utterances like The boy will eat the cakepresented in a visual setting with a boy, a cake, a toy car, a toy trainand a ball. They found that participants’ eye movements targetedthe cake already before the object noun phrase, so when they onlyheard The boy will eat. . .. Altmann and Kamide (1999) showedthat this was because the verb eat places a semantic restriction onthe object noun phrase and the cake was the only edible object inthe picture.
The goal of Experiment 3 was to investigate our anticipatorylook hypothesis.
Material and MethodsParticipantsTwenty-three non-dyslexic native speakers of Dutch wererecruited from the UiL OTS participant pool.
Data of three participants were discarded prior to analysis;one participant did not receive adequate instruction prior tothe experiment and reread the instruction sheet repeatedlyduring the experiment; one experimental run suffered from anunresponsive button box; and one participant was intimatelyfamiliar with linguistic theories on stress shift. The remaining
6Note that in Experiment 2, the continuation of the sentence with brandweerman‘fireman’ matches the picture in both the ES and LS condition, so our hypothesispredicts the same anticipatory looks targeting the ‘fireman’ in both conditions,with no significant difference between conditions. As can be seen in Figure 4, thisexpectation was met.
Frontiers in Psychology | www.frontiersin.org 14 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 15
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
1691
1692
1693
1694
1695
1696
1697
1698
1699
1700
1701
1702
1703
1704
1705
1706
1707
1708
1709
1710
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
participants were unaware of the purpose of the experiment,and were paid 5€ for their participation. The mean age of the20 remaining participants was 22;8 years, ranging from 17 to27 years; 14 females and 6 males; 18 participants were right-handed.
MaterialsLike in Experiments 1 and 2, 16 items were constructed in twoconditions. The auditory stimuli were identical to the ones usedin Experiments 1 and 2. The visual stimuli were changed in sucha way that the expected responses were NO in both conditions.In particular, the ‘fireman’ had a ‘celery’ and a ‘corn cob’, while the‘diver’ had a ‘celery’. See Figure 7 for an example.
The 64 unrelated fillers from Experiments 1 and 2 wereincluded alongside the test items. In addition 32 controls werecreated, 16 with early stress, 16 with late stress; half of the itemsmentioning the ‘doctor’. The expected response for the controlswas YES, to counterbalance the test items.
ProcedureThe procedure was identical to that of Experiments 1 and 2.
PredictionsThe experiment was designed to test our hypothesis that thetrend toward an early increase in looks targeting the ‘fireman’ inExperiment 1 was anticipatory in the following sense. Participantshear Ik heb alleen SELDERIJ. . . ‘I have only CELERY. . .’ andanticipate that the utterance will continue in such a way thatit matches the picture. In Experiment 3 the visual stimuluswas changed in such a way that the only person that has
FIGURE 7 | Example of visual stimulus for Experiment 3.
only celery is the ‘diver’. See Figure 7. The audio stimuliwere identical to that of Experiment 1. Thus, our anticipationhypothesis predicts that participants will look more at ‘thediver’ during the aan de ‘to the’ auditory segment in the EScondition. This is because the ‘diver’ has only ‘celery’. But oncethey hear the indirect object de brandweerman ‘the fireman’,their looks are expected to shift to the ‘fireman’. So, it isexpected that in Experiment 3 the anticipatory strategy ‘tricks’participants.
In addition, we expected that the findings of Experiments 1and 2 about divergent looks between the two conditions wouldbe replicated, except potentially, due to the potential hinderingeffect of the anticipatory looks, somewhat delayed.
ResultsEleven trials (six experimental) were removed from analysis ofthe response data because the response had already been givenbefore the onset of the indirect object (six experimental and fourfiller items) or utterance onset (one filler).
Number of Correct ResponsesThe percentage of correct responses for the LS condition was97%, for the ES condition 98%. The difference was not significant[F1(1,19) = 1.353, p = 0.259, η2
p = 0.066]. The overall correctresponse rate for Experiment 3 was 97%.
Response TimeThe overall mean response time from utterance onset for the LScondition was 2875 ms, while it was 2909 ms for the ES condition.The difference is not significant [F1(1,19) = 0.418, p = 0.526η2
p = 0.022).
Eye Gaze PatternsCoding and analysis was identical to that in Experiments 1 and 2.
Table 5 gives the analysis of variance for Experiment 3. Like inExperiments 1 and 2, we focus only on the significant interactionsbetween Visual AoI and Condition, as those are the relevantfindings for our research question. For ease of reference, theseare given in bold.
We find a significant interaction between AoI and Conditionin the aan de ‘to the’ auditory segment, during the indirect object,during the verb gegeven ‘given’, and in the two auditory segmentsafter utterance offset. Like in Experiments 1 and 2, the significantBonferroni-corrected pairwise comparisons for these auditorysegments are given in Table 6.
Figure 8 gives the mean proportion of looks for each AoI ineach auditory segment.
Discussion of Experiment 3 and GeneralDiscussionBehavioral MeasuresOverall, in none of the experiments was there any facilitatoryeffect of the early occurrence of stress in terms of shorterresponse times or a higher accuracy rate for the ES condition.We think that this is because even though participants mayuse the earliness of stress to anticipate the continuationof the utterance, they still have to wait until the sentence
Frontiers in Psychology | www.frontiersin.org 15 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 16
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
TABLE 5 | Analysis of variance per auditory segment for Experiment 3.
Auditory segment Factor df1 df2 F1 p η2p
Direct object AoI 5 95 7.429 0.000∗ 0.281
Condition 1 19 0.105 0.750 0.005
AoI∗condition 5 95 1.273 0.282 0.063
aan de ‘to the’ AoI 5 95 8.342 0.000∗ 0.305
Condition 1 19 3.147 0.092 0.142
AoI∗condition 5 95 3.434 0.007∗ 0.153
Indirect object AoI 3.348a 63.620a 34.894 0.000∗ 0.647
Condition 1 19 5.637 0.028∗ 0.229
AoI∗condition 5 95 3.894 0.003∗ 0.170
gegeven ‘given’ AoI 5 95 43.086 0.000∗ 0.694
Condition 1 19 20.789 0.000∗ 0.522
AoI∗condition 5 95 16.929 0.000∗ 0.471
0–500 ms after offset AoI 3.856a 73.263a 16.362 0.000∗ 0.463
Condition 1 19 12.266 0.002∗ 0.392
AoI∗condition 4.178a 79.389a 11.201 0.000∗ 0.371
501–1000 ms after offset AoI 2.652a 50.383a 6.628 0.000∗ 0.259
Condition 1 19 0.152 0.701 0.008
AoI∗condition 3.310a 62.893a 5.869 0.000∗ 0.236
aHuynh-Feldt corrected.
TABLE 6 | Proportions of time spent fixating each of the relevant people and objects in ES and LS condition for each auditory segment, and pairwisecomparisons between conditions with Bonferroni corrections applied for Experiment 3.
Auditory segment Person/object Proportion of time looking in ES Proportion of time looking in LS F1 (1,19) p η2p
aan de ‘to the’ Diver 0.27 0.18 7.518 0.013† 0.284
Indirect object Fireman’s celery 0.07 0.11 8.288 0.010† 0.304
gegeven ‘given’ Diver’s celery 0.04 0.11 15.733 0.001∗ 0.453
Fireman 0.33 0.20 28.690 0.000∗ 0.602
Fireman’s corn 0.18 0.09 21.543 0.000∗ 0.531
0–500 ms after offset Diver’s celery 0.02 0.07 19.961 0.000∗ 0.512
Fireman 0.15 0.07 13.266 0.002∗ 0.411
Fireman’s corn 0.13 0.06 16.140 0.001∗ 0.459
501–1000 ms after offset Diver’s celery 0.01 0.04 14.628 0.001∗ 0.435
is actually finished until they can establish the meaningcomponents based on the actual continuation. So, overall, evenif accentual information is presented earlier in the ES condition,leading to early identification of focus, this cannot facilitatecomputation of the meaning components associated with onlyoverall, due to the propositional nature of these meaningcomponents.
Eye Tracking PatternsThe visual stimulus in Experiment 3 was designed to test ourhypothesis that participants anticipate the continuation of theutterance when they hear Ik heb alleen SELDERIJ aan de. . . ‘Ihave only CELERY to the. . .’ in the ES condition. If the sentencewould continue with duiker ‘diver’, it would match the picture;and in the ES condition we do indeed find marginally more timeis being spent looking at the ‘diver’ than in the LS condition, rightbefore the indirect object is heard. Given that this effect is the startof a fixation curve (i.e., gradual growth, followed by robust peak,followed by gradual diminishing effect), we did not expect a more
robust effect at this point. So, we can confirm our anticipationhypothesis.
But, the actual continuation of the utterance turns out tobe brandweerman ‘fireman’. So, the utterance ends up beingfalse. As predicted (and already found in Experiments 1 and2), once the indirect object has been heard, looks shift tothe ‘fireman’ and his possessions in the ES condition andto the ‘diver’s celery’ in the LS condition. The effects aresomewhat delayed compared to Experiment 1, presumably dueto the hindering effect of the anticipation strategy. Duringthe sentence final verb, there are more looks targeting the‘fireman’ and the ‘fireman’s corn’ in the ES condition andmore looks targeting the ‘diver’s celery’ in the LS condition.We also expected that at this point, looks to the ‘diver’ wouldbe higher in the LS condition than in the ES condition,—the direct opposite of our expectation for the aan de ‘to the’auditory segment. Looks to the ‘diver’ are indeed numericallyhigher in the LS condition during the verb gegeven ‘given’but the effect does not reach significance [F1(1,19) = 6.576,
Frontiers in Psychology | www.frontiersin.org 16 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 17
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
FIGURE 8 | Time course of the mean fixation time proportions for each AoI in Experiment 3.
p = 0.019, η2p = 0.257). The expected effects on the ‘fireman’,
the ‘fireman’s corn’ and the ‘diver’s celery’ continue throughoutthe first 500 ms interval after utterance offset. For the‘diver’s celery’, the effect is still present during the second500 ms after utterance offset. In addition, a marginally higherproportion of looks targetted the ‘fireman’s celery’ in the LScondition during the indirect object auditory segment, whichwas unexpected. We interpret this as the participants verifyingthe non-focal meaning component (i.e., that the ‘fireman’ has‘celery’), which perhaps does not occur in the ES condition atthis point due to the hindering effect of the mis-anticipatedcontinuation.
Overall, we found that sentence verification starts early. Infact, perhaps unexpectedly, it starts already before the wholeutterance is heard. Participants anticipate the continuation ofthe utterance assuming that the utterance will turn out tomatch the picture. Crucially, prosodic focus on the directobject was found to be relevant for guiding anticipatory looksalready at the next sound segment, during aan de ‘to the’ (seeExperiment 3). This gives evidence of incremental prosodic focusprocessing.
In all three experiments, we found that utterance verificationproceeds according to the semantics of only-sentences (Horn,1969; Krifka, 1992; Rooth, 1992). Participants’ looks robustly
Frontiers in Psychology | www.frontiersin.org 17 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 18
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
2037
2038
2039
2040
2041
2042
2043
2044
2045
2046
2047
2048
2049
2050
2051
2052
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
diverge already during the sentence final verb gegeven ‘given’in the two conditions in all three experiments, as they verifythe different focal meaning components associated with earlyand late occurrence of stress. In Experiment 1, robust effectswere already found during the indirect object. So, we foundevidence that participants’ looks not only target the falsifyingentity in the picture, but rather the falsifying proposition wasestablished. This provides support for the psychological reality ofproposition-based semantics for prosodic focus association withonly.
CONCLUSION
Our results show incremental focus processing and thus fall inline with earlier results (Gennari et al., 2005; Paterson et al.,2007). Investigating the time course of looks accompanyingthe computation of only-sentences allowed us to pinpointthe time course of the semantic processing associated withfocal differences that are marked prosodically. We found thatpeople process prosodic focus immediately: there is evidenceof participants verifying the focal meaning component alreadyduring the indirect object. We also found that participants makeanticipatory looks in this picture verification task taking intoaccount the prosodic focus of the utterance, providing furtherevidence of incremental focus computation at the earliest possiblepoint, at the point where the direct object with or withoutprosodic focus is heard.
AUTHOR CONTRIBUTIONS
KS is responsible for the original design. IM determined theprocedure and was responsible for creating the materials, andimplementing and carrying out the experiment. IM carried
out the data analysis and statistical calculations. KS and IMinterpreted the data together and co-wrote the discussions.
FUNDING
KS wishes to acknowledge financial help from the Dutch ScienceFoundation (NWO) [VENI grant number 275-70-011], theGerman Research Foundation (DFG) [SFB 632], and to expressher gratitude to Wolfson College, Oxford for an academicvisitorship while the paper was prepared for submission.
ACKNOWLEDGMENTS
We thank Ignace Hooge for help with designing the visualstimuli. We thank Tom Fritzsche, Judit Gervain, Pim Mak, andShravan Vasishth for many useful comments about the contentof our paper, and Laura Boeschoten, Kirsten Passoa-Schutter,Chris van Run, Maarten Duijndam, and Arnout Koornneef forhelp on understanding the statistical concepts involved. Wethank the students who helped create the stimuli and run theexperiments: Stavroula Alexandropoulou, Desirée Capel, JanWillem Chevalking, Moreno Coco, Anja Goldschmidt, AlexiaGuerra Rivera, Loes Koring, Sophia Manika, Yueru Ni, AndryPanayiotou, Myrto Pantazi, Will Schuerman, and Elli Tourtouri.Last but not least, we dedicate this paper to the memory of TanyaReinhart, who inspired us to carry out this work.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be foundonline at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00150
REFERENCESAltmann, G. T., and Kamide, Y. (1999). Incremental interpretation at verbs:
restricting the domain of subsequent reference. Cognition 73, 247–264. doi:10.1016/S0010-0277(99)00059-1
Altmann, G. T. M., and Kamide, Y. (2004). “Now you see it, now you don’t:mediating the mapping between language and the visual world,” in TheInterface of Language, Vision, and Action: Eye Movements and the VisualWorld, eds J. M. Henderson and F. Ferreira (New York, NY: Psychology Press),347–368.
Arnold, J. E. (2008). {THE} {BACON} not the bacon: how children and adultsunderstand accented and unaccented noun phrases. Cognition 108, 69–99. doi:10.1016/j.cognition.2008.01.001
Boersma, P., and Weenink, D. (2006). Praat: Doing Phonetics by Computer.Available at: http://www.praat.org/
Chomsky, N. (1971). “Deep structure, surface structure, and semanticinterpretation,” in Semantics: An Interdisciplinary Reader in Philosophy,Linguistics and Psychology, eds L. A. Jakobovits and D. D. Steinberg (Cambridge:Cambridge University Press), 183–216.
Clifton, C. Jr., Bock, J., and Radó, J. (2000). “Chapter 23 – effects of the focusparticle only and intrinsic contrast on comprehension of reduced relativeclauses,” in Reading as a Perceptual Process, eds A. Kennedy, R. Radach, D.Heller, and J. Pynte (Oxford: North-Holland), 591–619.
Crain, S., Ni, W., and Conway, L. (1994). “Learning, parsing, and modularity,” inPerspectives on Sentence Processing, eds C. Clifton, L. Frazier, and K. Rayner(Hillsdale, NJ: Lawrence Erlbaum Associates), 443–467.
Crain, S., and Steedman, M. (1985). “On not being led up the garden path: the useof context by the psychological syntax processor,” in Natural Language Parsing:Psychological, Computational, and Theoretical Perspectives, eds D. R. Dowty,L. Karttunen, and A. M. Zwicky (Cambridge: Cambridge University Press),320–358.
Eberhard, K., Spivey-Knowlton, M., Sedivy, J., and Tanenhaus, M. (1995). Eyemovements as a window into real-time spoken language comprehensionin natural contexts. J. Psycholinguist. Res. 24, 409–436. doi: 10.1007/BF02143160
Filik, R., Paterson, K. B., and Liversedge, S. P. (2005). Parsing with focusparticles in context: eye movements during the processing of relativeclause ambiguities. J. Mem. Lang. 53, 473–495. doi: 10.1016/j.jml.2005.07.004
Gennari, S., Meroni, L., and Crain, S. (2005). “Rapid relief of stress in dealing withambiguity,” in Approaches to Studying World-Situated Language Use: Bridgingthe Language-as-Product and Language-as-Action Traditions, eds J. C. Trueswelland M. K. Tanenhaus (Cambridge, MA: MIT Press), 245–259.
Horn, L. R. (1969). “A presuppositional analysis of only and even,” in Proceedingsof the Annual Meeting of the Chicago Linguistics Society, Vol. 5, (Chicago, IL:University of Chicago), 97–108.
Frontiers in Psychology | www.frontiersin.org 18 March 2016 | Volume 7 | Article 150
fpsyg-07-00150 February 29, 2016 Time: 17:36 # 19
2053
2054
2055
2056
2057
2058
2059
2060
2061
2062
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
2082
2083
2084
2085
2086
2087
2088
2089
2090
2091
2092
2093
2094
2095
2096
2097
2098
2099
2100
2101
2102
2103
2104
2105
2106
2107
2108
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131
2132
2133
2134
2135
2136
2137
2138
2139
2140
2141
2142
2143
2144
2145
2146
2147
2148
2149
2150
2151
2152
2153
2154
2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
Mulders and Szendroi Early Association of Prosodic Focus with ‘only’
Ito, K., and Speer, S. R. (2008). Anticipatory effects of intonation: eyemovements during instructed visual search. J. Mem. Lang. 58, 541–573. doi:10.1016/j.jml.2007.06.013
Ito, K., and Speer, S. R. (2011). “Semantically-independent but contextually-dependent interpretation of contrastive accent,” in Prosodic Categories:Production, Perception and Comprehension, eds S. Frota, G. Elordieta, and P.Prieto (Dordrecht: Springer Netherlands), 69–92.
Kamide, Y., Altmann, G. T., and Heywood, S. (2005). “The time courseof constraint application during sentence processing in visual contexts:anticipatory eye movements in English and Japanese,” in Approaches to StudyingWorld-Situated Language Use: Bridging the Language-as-Product and Language-as-Action Traditions, eds J. C. Trueswell and M. K. Tanenhaus (Cambridge, MA:MIT Press), 229–243.
Kim, C. (2008). “Processing presupposition: verifying sentences with ’Only’,”in Proceedings of the Penn Linguistics Colloquium, Penn Working Papers inLinguistics, eds J. Tauberer, A. Eliam, and L. MacKenzie, 14, 213–226. Availableat: http://repository.upenn.edu/pwpl/vol14/iss1/17
Krifka, M. (1992). “A compositional semantics for multiple focus constructions,”in Informationsstruktur und Grammatik, ed. J. Jacobs (Opladen: WestdeutscherVerlag), 17–53.
Liversedge, S. P., Paterson, K. B., and Clayes, E. L. (2002). The influence of only onsyntactic processing of “long” relative clause sentences. Q. J. Exp. Psychol. A 55,225–240. doi: 10.1080/02724980143000253
Paterson, K. B., Liversedge, S. P., Filik, R., Juhasz, B. J., White, S. J., andRayner, K. (2007). Focus identification during sentence comprehension:evidence from eye movements. Q. J. Exp. Psychol. 60, 1423–1445. doi:10.1080/17470210601100563
Paterson, K. B., Liversedge, S. P., and Underwood, G. (1999). The influence of focusoperators on syntactic processing of short relative clause sentences. Q. J. Exp.Psychol. A 52, 717–737. doi: 10.1080/713755827
Rooth, M. (1992). A theory of focus interpretation. Nat. Lang. Semantics 1, 75–116.doi: 10.1007/BF02342617
Sauermann, A., Filik, R., and Paterson, K. B. (2013). Processing contextual andlexical cues to focus: Evidence from eye movements in reading. Lang. Cogn.Process. 28, 875–903. doi: 10.1080/01690965.2012.668197
Sedivy, J. C. (2002). Invoking discourse-based contrast sets and resolving syntacticambiguities. J. Mem. Lang. 46, 341–370. doi: 10.1006/jmla.2001.2812
Veenker, T. (2005). Flexible Experiment Program. Utrecht: Utrecht University.
Conflict of Interest Statement: The authors declare that the research wasconducted in the absence of any commercial or financial relationships that couldbe construed as a potential conflict of interest.
The reviewer BF and handling Editor declared a current collaboration and thehandling Editor states that the process nevertheless met the standards of a fair andobjective review.
Copyright © 2016 Mulders and Szendroi. This is an open-access article distributedunder the terms of the Creative Commons Attribution License (CC BY). Theuse, distribution or reproduction in other forums is permitted, provided theoriginal author(s) or licensor are credited and that the original publication inthis journal is cited, in accordance with accepted academic practice. No use,distribution or reproduction is permitted which does not comply with theseterms.
Frontiers in Psychology | www.frontiersin.org 19 March 2016 | Volume 7 | Article 150