Eye Movements and Scene Perception: Memory for Things …

14
Copyright 2002 Psychonomic Society, Inc. 882 Perception & Psychophysics 2002, 64 (6), 882-895 Consider walking into a room that you have never been in before. The room contains more information than you can perceive in a single glance, so you direct your eyes, via rapid movements called saccades, so as to fixate objects around you. Each eye movement causes visual information to be swept across your retinas, producing a blur or a smear that is not perceived, because of saccadic suppression (see Matin, 1974, for a review). Because of saccadic suppres- sion, visual information about the room is registered in iso- lated glimpses that are separated in time. Furthermore, the contents of these isolated glimpses are not identical, be- cause the retinal positions of the objects in the room change from one fixation to the next. Despite this, you perceive the room as unified, stable, and continuous, with objects main- taining their positions in space (see Bridgeman, van der Heijden, & Velichkovsky, 1994, for a review). There is no feeling of “starting anew” with each fixation; rather, you develop an on-line mental representation, or situation model (De Graef, 1992; cf. Van Dijk & Kintsch, 1983), of the room that describes where you are, what objects are present, what they look like, and where they are located. Or do you? In truth, surprisingly little is known about the nature of the mental representation that is built up across successive eye movements. Intuitively, it seems very detailed, but considerable research indicates that it is not. For example, people are unable to integrate visual pat- terns viewed in successive eye fixations so as to perceive some composite pattern (e.g., Bridgeman & Mayer, 1983; Irwin, Brown, & Sun, 1988; Irwin, Yantis, & Jonides, 1983; O’Regan & Levy-Schoen, 1983; Rayner & Pollat- sek, 1983). Furthermore, stimulus displacements during saccadic eye movements are frequently undetected unless the displacements are large (e.g., Bridgeman, Hendry, & Stark, 1975), and changes in visual patterns are difficult to detect unless the patterns are simple (e.g., Irwin, 1991). Similarly, changing the visual characteristics of words (by varying letter case, for example) during eye movements has no effect on reading (e.g., McConkie & Zola, 1979) or word naming (e.g., Rayner, McConkie, & Zola, 1980), and changes in the spatial positions or visual properties of pictures during eye movements are frequently undetected (e.g., Currie, McConkie, Carlson-Radvansky, & Irwin, 2000; Grimes, 1996; Henderson, 1997; Henderson & Hollingworth, 1999b; Hollingworth & Henderson, 2000, 2002; Hollingworth, Williams, & Henderson, 2001; Mc- Conkie & Currie, 1996) and have little or no effect on pic- ture naming (e.g., Pollatsek, Rayner, & Henderson, 1990). This insensitivity to change is not restricted to changes made during eye movements, however, but occurs also when a blank screen is interposed between an original and a changed display (e.g., Blackmore, Brelstaff, Nel- son, & Troscianko, 1995; Pashler, 1988; Phillips, 1974; Rensink, O’Regan, & Clark, 1997; Simons, 1996), when film cuts occur in motion pictures (e.g., Hochberg, 1986; Levin & Simons, 1997), or when one’s view of the world This research was supported by National Science Foundation Grant SBR 96-15988 to D.E.I. We thank Bruce Bridgeman, Neal Cohen, Heiner Deubel, Peter De Graef, Albrecht Inhoff, Art Kramer, Gordon Logan, Lester Loschky, George McConkie, and Jennifer Ryan for helpful com- ments on the research, Gary Wolverton for programming assistance, and Emily Kyro for assistance with data collection. Correspondence con- cerning this article should be addressed to D. E. Irwin, Department of Psychology, University of Illinois at Urbana-Champaign, 603 East Daniel St., Champaign, IL 61820 (e-mail: [email protected]). Eye movements and scene perception: Memory for things observed DAVID E. IRWIN University of Illinois at Urbana-Champaign, Champaign, Illinois and GREGORY J. ZELINSKY State University of New York, Stony Brook, New York In this study, we examined the characteristics of on-line scene representations, using a partial-report procedure. Subjects inspected a simple scene containing seven objects for 1, 3, 5, 9, or 15 fixations; shortly after scene offset, a marker cued one scene location for report. Consistent with previous re- search, the results indicated that scene representations are relatively sparse; even after 15 fixations on a scene, the subjects remembered the position/identity pairings for only about 78% of the objects in the scene, or the equivalent of about five objects-worth of information. Report of the last three objects that were foveated and of the object about to be foveated was very accurate, however, suggesting that re- cently attended information in a scene is represented quite well. Information about the scene appeared to accumulate over multiple fixations, but the capacity of the on-line scene representation appeared to be limited to about five items. Implications for recent theories of scene representation are discussed.

Transcript of Eye Movements and Scene Perception: Memory for Things …

Copyright 2002 Psychonomic Society Inc 882

Perception amp Psychophysics2002 64 (6) 882-895

Consider walking into a room that you have never beenin before The room contains more information than youcan perceive in a single glance so you direct your eyes viarapid movements called saccades so as to fixate objectsaround you Each eye movement causes visual informationto be swept across your retinas producing a blur or a smearthat is not perceived because of saccadic suppression (seeMatin 1974 for a review) Because of saccadic suppres-sion visual information about the room is registered in iso-lated glimpses that are separated in time Furthermore thecontents of these isolated glimpses are not identical be-cause the retinal positions of the objects in the room changefrom one fixation to the next Despite this you perceive theroom as unified stable and continuous with objects main-taining their positions in space (see Bridgeman van derHeijden amp Velichkovsky 1994 for a review) There is nofeeling of ldquostarting anewrdquo with each fixation rather youdevelop an on-line mental representation or situationmodel (De Graef 1992 cf Van Dijk amp Kintsch 1983) ofthe room that describes where you are what objects arepresent what they look like and where they are located

Or do you In truth surprisingly little is known aboutthe nature of the mental representation that is built up

across successive eye movements Intuitively it seemsvery detailed but considerable research indicates that it isnot For example people are unable to integrate visual pat-terns viewed in successive eye fixations so as to perceivesome composite pattern (eg Bridgeman amp Mayer 1983Irwin Brown amp Sun 1988 Irwin Yantis amp Jonides1983 OrsquoRegan amp Levy-Schoen 1983 Rayner amp Pollat-sek 1983) Furthermore stimulus displacements duringsaccadic eye movements are frequently undetected unlessthe displacements are large (eg Bridgeman Hendry ampStark 1975) and changes in visual patterns are difficultto detect unless the patterns are simple (eg Irwin 1991)Similarly changing the visual characteristics of words (byvarying letter case for example) during eye movementshas no effect on reading (eg McConkie amp Zola 1979)or word naming (eg Rayner McConkie amp Zola 1980)and changes in the spatial positions or visual properties ofpictures during eye movements are frequently undetected(eg Currie McConkie Carlson-Radvansky amp Irwin2000 Grimes 1996 Henderson 1997 Henderson ampHollingworth 1999b Hollingworth amp Henderson 20002002 Hollingworth Williams amp Henderson 2001 Mc-Conkie amp Currie 1996) and have little or no effect on pic-ture naming (eg Pollatsek Rayner amp Henderson 1990)This insensitivity to change is not restricted to changesmade during eye movements however but occurs alsowhen a blank screen is interposed between an originaland a changed display (eg Blackmore Brelstaff Nel-son amp Troscianko 1995 Pashler 1988 Phillips 1974Rensink OrsquoRegan amp Clark 1997 Simons 1996) whenfilm cuts occur in motion pictures (eg Hochberg 1986Levin amp Simons 1997) or when onersquos view of the world

This research was supported by National Science Foundation GrantSBR 96-15988 to DEI We thank Bruce Bridgeman Neal Cohen HeinerDeubel Peter De Graef Albrecht Inhoff Art Kramer Gordon LoganLester Loschky George McConkie and Jennifer Ryan for helpful com-ments on the research Gary Wolverton for programming assistance andEmily Kyro for assistance with data collection Correspondence con-cerning this article should be addressed to D E Irwin Department ofPsychology University of Illinois at Urbana-Champaign 603 East DanielSt Champaign IL 61820 (e-mail dirwinspsychuiucedu)

Eye movements and scene perception Memory for things observed

DAVID E IRWINUniversity of Illinois at Urbana-Champaign Champaign Illinois

and

GREGORY J ZELINSKYState University of New York Stony Brook New York

In this study we examined the characteristics of on-line scene representations using a partial-reportprocedure Subjects inspected a simple scene containing seven objects for 1 3 5 9 or 15 fixationsshortly after scene offset a marker cued one scene location for report Consistent with previous re-search the results indicated that scene representations are relatively sparse even after 15 fixations ona scene the subjects remembered the positionidentity pairings for only about 78 of the objects in thescene or the equivalent of about five objects-worth of information Report of the last three objects thatwere foveated and of the object about to be foveated was very accurate however suggesting that re-cently attended information in a scene is represented quite well Information about the scene appearedto accumulate over multiple fixations but the capacity of the on-line scene representation appeared tobe limited to about five items Implications for recent theories of scene representation are discussed

EYE MOVEMENTS AND MEMORY 883

is interrupted by an occluding object (Simons amp Levin1998) In short people are strikingly bad at detectingchanges in the visual world whenever low-level motioncues are eliminated by eye movements flicker or otherdisruptive visual events (see Simons amp Levin 1997 for areview) These findings indicate that people do not haveunlimited access to a richly detailed mental representa-tion of a scene otherwise these changes would be quitenoticeable

So what is contained in onersquos on-line representation ofa scene It seems clear that some information is repre-sented we are not startled every time we move our eyesand take in a new view of the world for example Thepurpose of the present research was to begin to deter-mine the characteristics of on-line scene representationsIn particular we investigated how much people remem-ber from a simple scene what factors influence theirmemory and how their on-line scene representationchanges as a function of the number of fixations on thescene

To address the nature of on-line scene representationsa partial-report technique (Averbach amp Coriell 1961)was used to assess peoplersquos memory for pictures of sim-ple scenes The scenes always used a babyrsquos crib as abackground and contained seven objects arrayed inseven fixed locations in the crib The subjects began eachtrial by fixating a plus sign in the bottom center of thecrib scene They were instructed to inspect the scene andto try to remember which objects were located whereThe subjectrsquos eye position was monitored with an eye-tracker and during some critical saccade (the 1st the3rd the 5th the 9th or the 15th) the scene was eraseda delay ensued until the subjectrsquos eye settled into its newposition and then a partial-report probe a bar markerwas presented near one of the previously occupied criblocations The subjectrsquos task was to report the object thathad appeared in the probed position

METHOD

SubjectsTwelve observers participated in this experiment Six subjects

completed a version of the experiment in which 1 3 or 5 fixationswere allowed on the scene whereas the other 6 subjects completeda version of the experiment in which 3 9 or 15 f ixations were al-lowed on the scene With the exception of the second author all ofthe observers were naive as to the specific questions under investi-gation and all were paid $8h for their participation None of thesubjects required visual correction and all had normal color vision(both by self-report)

StimuliThe stimuli were scenes consisting of seven objects (teddy bear

baby bottle toy car box of crayons baby doll rubber ducky and toytrumpet) appearing in a babyrsquos crib These objects were arranged ina semicircle around the observerrsquos initial fixation point a 7ordm visualangle was subtended by this point and each objectrsquos center The an-gular positions of these seven objects relative to the fixation pointwere f ixed at 225ordm 45ordm 675ordm 90ordm 1125ordm 135ordm and 1575ordm al-though the identity of the object appearing at a particular positionvaried from trial to trial As a result of these placement constraints

the minimum and maximum center-to-center object separationswere 27ordm and 130ordm respectively Figure 1 shows a grayscale re-production of one of the actual scenes used in the experiment Each756 3 486 pixel image subtended 18ordm of visual angle horizontallyand 116ordm vertically at a viewing distance of 82 cm The componentobjects although irregular in shape could each fit within a 24ordmsquare bounding box Both the crib background and the objectswere displayed at 72 pixelsin in 16-bit RGB color Details regard-ing the construction and software manipulation of these images canbe found elsewhere (Zelinsky 1999 2001) The stimuli were dis-played on a Princeton Ultra-Synch monitor controlled by anATVista display controller card in a 386 computer and refreshed at60 Hz The pictures were displayed at a comfortable luminance ina dimly illuminated room shutter tests (McConkie amp Currie 1996)confirmed that visible phosphor persistence decayed from the dis-play screen within 12 msec after stimulus offset

ProcedureThe observers viewed the above-described scenes in preparation

for a recognition judgment following a spatial probe Precedingeach trial there was a fixation target at the point corresponding tothe origin of the semicircle in the study scene The observers wereasked to look carefully at this target (a 05ordm X in a box) and then topress a hand-held button to initiate the trial Once a scene appeared(Figure 2A) it remained visible for a variable period of time an in-terval constituting our primary experimental manipulation The ob-servers were allowed to freely view the scenes for exactly 1 3 or 5fixations (in Version 1 of the experiment) or for exactly 3 9 or 15fixations (in Version 2 of the experiment) depending on the condi-tion being instantiated on that particular trial As the observer rsquosgaze moved away from one of these critical fixations (eg the sac-cade following the 3rd f ixation in a 3-fixation trial as illustrated inFigure 2A) the scene was immediately replaced with a dark screenbefore the eye reached its next intended destination Note that theinitial fixation on the scene counted as one of these critical fixa-tions meaning that the observers would not be permitted to fixatean object in the 1-fixation condition

Just as the actual time that the observers had to inspect the scenedepended on both the fixation criteria and the durations of the in-dividual fixations so the blank delay immediately following thestudy scene also depended on the observerrsquos oculomotor behaviorThe reason for this variable delay was to ensure that the eye wasstable prior to the onset of the spatial probe If the probe appearedwhile the eye was still in motion this motion might mask the sud-den onset of the probe causing an ineffective direction of attentionto the desired location Our criteria for stability required that theeye be moving at a velocity less than 12 degsec for at least 50 msecfollowing the end of the critical saccade If the eye exceeded this ve-locity threshold at any time during the 50 msec the counter wouldbe reset and another delay would be initiated As a result of thesestability criteria the actual delay between study scene offset andprobe onset averaged 153 msec Following the postscene delay thetarget was designated by a perfectly valid spatial probe flashed fora brief 50 msec (Figure 2B) The probe was a white bar 075ordm inlength positioned along an imaginary line passing through the fix-ation point and the center of the target object The distance betweenthe fixation point and the nearest point on the probe was 87ordm plac-ing the probe just beyond the region described by the target bound-ing box (ie the target and the probe did not spatially overlap) A2-sec blank interval followed the offset of the probe after whichappeared a seven-alternative forced-choice response grid (Fig-ure 2C) The grid depicted the seven study objects in f ixed displaylocations (eg the teddy bear always appeared in the upper-left po-sition of the arch-shaped configuration) Object identity wasmapped consistently to display position so as to minimize the needfor the subjects to search for the target during response The ob-serversrsquo task was to select the object in the response grid indicated

884 IRWIN AND ZELINSKY

by the spatial probe Because the observers were on a bite-bar andcould not easily speak their method of indicating a selection wassimply to look at the desired object an act that caused a white boxto be drawn around the fixated item and then to press a hand-heldbutton when satisfied with their selection There was no accuracyfeedback and the time commitment per subject was less than 2 h

DesignEach observer participated in 147 trials with these trials evenly

divided into three randomly interleaved fixation conditions (1 3 or5 in Version 1 of the experiment and 3 9 or 15 in Version 2 of theexperiment) Postexperiment questioning revealed that the naiveobservers were unaware of these variable scene presentation timesbeing contingent upon their oculomotor behavior Within each con-dition the spatial probe and therefore the target appeared equallyoften (seven times) at each of the seven display locations Each ob-ject also served as the target seven times resulting in 147 trials (3fixation conditions 3 7 locations 3 7 objects) Except for the aboveconstraints the seven objects in each scene were randomly assignedto the seven locations

Eye Movement RecordingA Fourward Technologies Generation V dual Purkinje-image

eyetracker interfaced with a Tecmar 8-bit 12-channel analog-to-digital converter was used to sample eye position every millisecondthroughout each trial The spatial precision of this tracker duringfixation is 61 min of visual angle To achieve this level of preci-sion a dental impression bite-bar was constructed for each subjectwho was asked to remain on the bite-bar until scheduled breaksafter every 30 trials Prior to the start of the experiment and againbefore every block of trials the observers performed a calibration

procedure consisting of accurately fixating each of nine targets de-marcating the 18ordm 3 116ordm field of view As an internal validitycheck the calibration procedure would not terminate until repeatedfixations on each target resulted in an eye position error of less than03ordm Other than these calibration instructions and a reminder tolook at the fixation cross whenever it appeared on the screen theobservers were not told how to direct their gaze in this experiment Saccadic eye movements were extracted on line using a velocity-based algorithm implementing a roughly 125 degsec detectionthreshold Fixation duration was defined as the period between theoffset of one saccade and the onset of the next Fixations that fellwithin the virtual 24ordm 3 24ordm bounding box within which each ob-ject was presented were scored as being on the object

RESULTS AND DISCUSSION

Discarded DataTrial data were deleted from analysis if the eye was not

within the fixation box when the trial started or if the eye-tracker lost track of the eye during scene blank or probepresentation The proportion of deleted trials did not varysignificantly across conditions in Version 1 of the experi-ment 187 of the trials were deleted from the 1-fixationcondition 170 from the 3-f ixation condition and177 from the 5-fixation condition A procedural changewas implemented in Version 2 of the experiment in orderto reduce the number of trials lost owing to inaccurate fix-ation on the fixation cross that preceded scene presenta-

Figure 1 Grayscale reproduction of one of the scenes used in the experiment

EYE MOVEMENTS AND MEMORY 885

tion In particular to make certain that the subjects wereindeed looking at this cross the study scene would not ap-pear until the gaze deviated from the center of the cross byless than 1ordm at the time of the buttonpress This precautioneliminated the need to discard trials caused by anticipa-

tory saccades to the fixed display locations Thus trialdata were deleted from analysis in Version 2 of the exper-iment only if the eyetracker lost track of the eye duringscene blank or probe presentation The proportion ofdeleted trials in Version 2 did not vary significantly across

Figure 2 Highlights of the experimental procedure

A

B

C

886 IRWIN AND ZELINSKY

conditions 54 of the trials were deleted from the3-fixation condition 109 from the 9-fixation condi-tion and 92 from the 15-fixation condition

Scene Viewing Time and Number of ObjectsFoveated

Because the duration of the display on each trial wasdetermined by fixation condition scene viewing time in-creased as the number of fixations allowed on the sceneincreased (see Table 1) The mean viewing times rangedfrom 174 msec in the 1-fixation condition to 4927 msecin the 15-fixation condition As one might expect thenumber of objects foveated in the scene also increasedwith the number of fixations allowed on the scene In the1-fixation condition the scene was erased as soon as theeyes left the fixation box at the bottom of the scene sono objects were foveated in this condition At the otherextreme an average of 51 objects were foveated in the15-fixation condition (see Table 1) The number of ob-jects foveated in each condition was less than the num-ber of fixations because some fixations fell on no objectand some objects were foveated more than once Meanfixation duration tended to increase as the number of fix-ations on the scene increased (see Figure 3) The first 2fixations on the scene were quite short (approximately150 msec in duration) whereas the duration of subse-quent fixations was captured reasonably well (r2 5 77)by the following equation fixation duration 5 2376 155(fixation number)

Given these differences in viewing time and in the num-ber of objects foveated it would be reasonable to expectthat overall accuracy in reporting the probed object wouldincrease across fixation conditions This is discussed next

Overall AccuracyIn Version 1 of the experiment mean accuracy was

360 in the 1-fixation condition 658 in the 3-fixationcondition and 672 in the 5-fixation condition The ef-fect of f ixation condition was significant [F(210) 5419 MSe 5 446 p 001] A planned comparison basedon the error term of the analysis of variance (ANOVA)showed that differences of 78 were significant at the05 level thus all pairwise comparisons were significantexcept for the difference between the 3-fixation condi-tion and the 5-fixation condition In Version 2 of the ex-periment mean accuracy was 545 in the 3-fixationcondition 707 in the 9-fixation condition and 777in the 15-fixation condition The effect of fixation con-dition was significant [F(210) 5 288 MSe 5 294 p 001] A planned comparison based on the error term ofthe ANOVA showed that differences of 63 were sig-nificant at the 05 level thus all pairwise comparisonswere significant A t test comparing the two 3-fixationconditions was not significant [t(10) 5 21 p 05]Mean accuracy at reporting the probed object as a func-tion of the number of fixations on the scene is shown inFigure 4 (the data from the 3-fixation conditions werecombined for this figure)

A number of secondary analyses were done to inves-tigate whether practice or learning had any effect on ac-curacy Because the same objects appeared in differentpositions on every trial it seemed possible that the mem-ory of where an object appeared on trial n 2 1 might in-fluence recall of where that object had appeared on trialn (eg because of proactive interference) There was noevidence that accuracy declined over trials however InVersion 1 of the experiment mean accuracy during thefirst second and last third of the experiment was 527

Table 1Viewing Time and Number of Objects Foveated as a Function

of Fixation Condition

Fixation Condition Viewing Time (msec) Objects Foveated

1 174 0003 739 1255 1360 2689 2810 347

15 4927 510

Figure 3 Mean fixation duration as a function of number of fixations on the scene

EYE MOVEMENTS AND MEMORY 887

551 and 589 In Version 2 of the experiment thecorresponding accuracies were 683 671 and683 There was also no evidence that probing thesame target object (eg the duck) in different positionson successive trials influenced performance accuracywas 626 when the target object on trial n had also beenthe target object (but in a different position) on trial n 21 and accuracy was 621 when different target objectswere probed on trial n and trial n 2 1 (there were no tri-als in which the same target object appeared in the same

probed position on successive trials) In sum there wasno evidence that learning or repetition effects interferedwith memory recall

The analyses that follow were designed to illuminatewhat factors did affect overall accuracy

Accuracy 3 Probe PositionPrevious partial-report experiments have found that

accuracy depends strongly on an itemrsquos position in a dis-play (eg Mewhort Campbell Marchetti amp Campbell

Fixation Condition

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 3 5 9 15

Figure 4 Mean accuracy as a function of the number of fixations on the scene (error barsrepresent the standard error of the mean)

1 fixation3 fixations5 fixations9 fixations15 fixations

Position

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 2 3 4 5 6 7

Figure 5 Mean accuracy as a function of probe position for each fixation condition

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 883

is interrupted by an occluding object (Simons amp Levin1998) In short people are strikingly bad at detectingchanges in the visual world whenever low-level motioncues are eliminated by eye movements flicker or otherdisruptive visual events (see Simons amp Levin 1997 for areview) These findings indicate that people do not haveunlimited access to a richly detailed mental representa-tion of a scene otherwise these changes would be quitenoticeable

So what is contained in onersquos on-line representation ofa scene It seems clear that some information is repre-sented we are not startled every time we move our eyesand take in a new view of the world for example Thepurpose of the present research was to begin to deter-mine the characteristics of on-line scene representationsIn particular we investigated how much people remem-ber from a simple scene what factors influence theirmemory and how their on-line scene representationchanges as a function of the number of fixations on thescene

To address the nature of on-line scene representationsa partial-report technique (Averbach amp Coriell 1961)was used to assess peoplersquos memory for pictures of sim-ple scenes The scenes always used a babyrsquos crib as abackground and contained seven objects arrayed inseven fixed locations in the crib The subjects began eachtrial by fixating a plus sign in the bottom center of thecrib scene They were instructed to inspect the scene andto try to remember which objects were located whereThe subjectrsquos eye position was monitored with an eye-tracker and during some critical saccade (the 1st the3rd the 5th the 9th or the 15th) the scene was eraseda delay ensued until the subjectrsquos eye settled into its newposition and then a partial-report probe a bar markerwas presented near one of the previously occupied criblocations The subjectrsquos task was to report the object thathad appeared in the probed position

METHOD

SubjectsTwelve observers participated in this experiment Six subjects

completed a version of the experiment in which 1 3 or 5 fixationswere allowed on the scene whereas the other 6 subjects completeda version of the experiment in which 3 9 or 15 f ixations were al-lowed on the scene With the exception of the second author all ofthe observers were naive as to the specific questions under investi-gation and all were paid $8h for their participation None of thesubjects required visual correction and all had normal color vision(both by self-report)

StimuliThe stimuli were scenes consisting of seven objects (teddy bear

baby bottle toy car box of crayons baby doll rubber ducky and toytrumpet) appearing in a babyrsquos crib These objects were arranged ina semicircle around the observerrsquos initial fixation point a 7ordm visualangle was subtended by this point and each objectrsquos center The an-gular positions of these seven objects relative to the fixation pointwere f ixed at 225ordm 45ordm 675ordm 90ordm 1125ordm 135ordm and 1575ordm al-though the identity of the object appearing at a particular positionvaried from trial to trial As a result of these placement constraints

the minimum and maximum center-to-center object separationswere 27ordm and 130ordm respectively Figure 1 shows a grayscale re-production of one of the actual scenes used in the experiment Each756 3 486 pixel image subtended 18ordm of visual angle horizontallyand 116ordm vertically at a viewing distance of 82 cm The componentobjects although irregular in shape could each fit within a 24ordmsquare bounding box Both the crib background and the objectswere displayed at 72 pixelsin in 16-bit RGB color Details regard-ing the construction and software manipulation of these images canbe found elsewhere (Zelinsky 1999 2001) The stimuli were dis-played on a Princeton Ultra-Synch monitor controlled by anATVista display controller card in a 386 computer and refreshed at60 Hz The pictures were displayed at a comfortable luminance ina dimly illuminated room shutter tests (McConkie amp Currie 1996)confirmed that visible phosphor persistence decayed from the dis-play screen within 12 msec after stimulus offset

ProcedureThe observers viewed the above-described scenes in preparation

for a recognition judgment following a spatial probe Precedingeach trial there was a fixation target at the point corresponding tothe origin of the semicircle in the study scene The observers wereasked to look carefully at this target (a 05ordm X in a box) and then topress a hand-held button to initiate the trial Once a scene appeared(Figure 2A) it remained visible for a variable period of time an in-terval constituting our primary experimental manipulation The ob-servers were allowed to freely view the scenes for exactly 1 3 or 5fixations (in Version 1 of the experiment) or for exactly 3 9 or 15fixations (in Version 2 of the experiment) depending on the condi-tion being instantiated on that particular trial As the observer rsquosgaze moved away from one of these critical fixations (eg the sac-cade following the 3rd f ixation in a 3-fixation trial as illustrated inFigure 2A) the scene was immediately replaced with a dark screenbefore the eye reached its next intended destination Note that theinitial fixation on the scene counted as one of these critical fixa-tions meaning that the observers would not be permitted to fixatean object in the 1-fixation condition

Just as the actual time that the observers had to inspect the scenedepended on both the fixation criteria and the durations of the in-dividual fixations so the blank delay immediately following thestudy scene also depended on the observerrsquos oculomotor behaviorThe reason for this variable delay was to ensure that the eye wasstable prior to the onset of the spatial probe If the probe appearedwhile the eye was still in motion this motion might mask the sud-den onset of the probe causing an ineffective direction of attentionto the desired location Our criteria for stability required that theeye be moving at a velocity less than 12 degsec for at least 50 msecfollowing the end of the critical saccade If the eye exceeded this ve-locity threshold at any time during the 50 msec the counter wouldbe reset and another delay would be initiated As a result of thesestability criteria the actual delay between study scene offset andprobe onset averaged 153 msec Following the postscene delay thetarget was designated by a perfectly valid spatial probe flashed fora brief 50 msec (Figure 2B) The probe was a white bar 075ordm inlength positioned along an imaginary line passing through the fix-ation point and the center of the target object The distance betweenthe fixation point and the nearest point on the probe was 87ordm plac-ing the probe just beyond the region described by the target bound-ing box (ie the target and the probe did not spatially overlap) A2-sec blank interval followed the offset of the probe after whichappeared a seven-alternative forced-choice response grid (Fig-ure 2C) The grid depicted the seven study objects in f ixed displaylocations (eg the teddy bear always appeared in the upper-left po-sition of the arch-shaped configuration) Object identity wasmapped consistently to display position so as to minimize the needfor the subjects to search for the target during response The ob-serversrsquo task was to select the object in the response grid indicated

884 IRWIN AND ZELINSKY

by the spatial probe Because the observers were on a bite-bar andcould not easily speak their method of indicating a selection wassimply to look at the desired object an act that caused a white boxto be drawn around the fixated item and then to press a hand-heldbutton when satisfied with their selection There was no accuracyfeedback and the time commitment per subject was less than 2 h

DesignEach observer participated in 147 trials with these trials evenly

divided into three randomly interleaved fixation conditions (1 3 or5 in Version 1 of the experiment and 3 9 or 15 in Version 2 of theexperiment) Postexperiment questioning revealed that the naiveobservers were unaware of these variable scene presentation timesbeing contingent upon their oculomotor behavior Within each con-dition the spatial probe and therefore the target appeared equallyoften (seven times) at each of the seven display locations Each ob-ject also served as the target seven times resulting in 147 trials (3fixation conditions 3 7 locations 3 7 objects) Except for the aboveconstraints the seven objects in each scene were randomly assignedto the seven locations

Eye Movement RecordingA Fourward Technologies Generation V dual Purkinje-image

eyetracker interfaced with a Tecmar 8-bit 12-channel analog-to-digital converter was used to sample eye position every millisecondthroughout each trial The spatial precision of this tracker duringfixation is 61 min of visual angle To achieve this level of preci-sion a dental impression bite-bar was constructed for each subjectwho was asked to remain on the bite-bar until scheduled breaksafter every 30 trials Prior to the start of the experiment and againbefore every block of trials the observers performed a calibration

procedure consisting of accurately fixating each of nine targets de-marcating the 18ordm 3 116ordm field of view As an internal validitycheck the calibration procedure would not terminate until repeatedfixations on each target resulted in an eye position error of less than03ordm Other than these calibration instructions and a reminder tolook at the fixation cross whenever it appeared on the screen theobservers were not told how to direct their gaze in this experiment Saccadic eye movements were extracted on line using a velocity-based algorithm implementing a roughly 125 degsec detectionthreshold Fixation duration was defined as the period between theoffset of one saccade and the onset of the next Fixations that fellwithin the virtual 24ordm 3 24ordm bounding box within which each ob-ject was presented were scored as being on the object

RESULTS AND DISCUSSION

Discarded DataTrial data were deleted from analysis if the eye was not

within the fixation box when the trial started or if the eye-tracker lost track of the eye during scene blank or probepresentation The proportion of deleted trials did not varysignificantly across conditions in Version 1 of the experi-ment 187 of the trials were deleted from the 1-fixationcondition 170 from the 3-f ixation condition and177 from the 5-fixation condition A procedural changewas implemented in Version 2 of the experiment in orderto reduce the number of trials lost owing to inaccurate fix-ation on the fixation cross that preceded scene presenta-

Figure 1 Grayscale reproduction of one of the scenes used in the experiment

EYE MOVEMENTS AND MEMORY 885

tion In particular to make certain that the subjects wereindeed looking at this cross the study scene would not ap-pear until the gaze deviated from the center of the cross byless than 1ordm at the time of the buttonpress This precautioneliminated the need to discard trials caused by anticipa-

tory saccades to the fixed display locations Thus trialdata were deleted from analysis in Version 2 of the exper-iment only if the eyetracker lost track of the eye duringscene blank or probe presentation The proportion ofdeleted trials in Version 2 did not vary significantly across

Figure 2 Highlights of the experimental procedure

A

B

C

886 IRWIN AND ZELINSKY

conditions 54 of the trials were deleted from the3-fixation condition 109 from the 9-fixation condi-tion and 92 from the 15-fixation condition

Scene Viewing Time and Number of ObjectsFoveated

Because the duration of the display on each trial wasdetermined by fixation condition scene viewing time in-creased as the number of fixations allowed on the sceneincreased (see Table 1) The mean viewing times rangedfrom 174 msec in the 1-fixation condition to 4927 msecin the 15-fixation condition As one might expect thenumber of objects foveated in the scene also increasedwith the number of fixations allowed on the scene In the1-fixation condition the scene was erased as soon as theeyes left the fixation box at the bottom of the scene sono objects were foveated in this condition At the otherextreme an average of 51 objects were foveated in the15-fixation condition (see Table 1) The number of ob-jects foveated in each condition was less than the num-ber of fixations because some fixations fell on no objectand some objects were foveated more than once Meanfixation duration tended to increase as the number of fix-ations on the scene increased (see Figure 3) The first 2fixations on the scene were quite short (approximately150 msec in duration) whereas the duration of subse-quent fixations was captured reasonably well (r2 5 77)by the following equation fixation duration 5 2376 155(fixation number)

Given these differences in viewing time and in the num-ber of objects foveated it would be reasonable to expectthat overall accuracy in reporting the probed object wouldincrease across fixation conditions This is discussed next

Overall AccuracyIn Version 1 of the experiment mean accuracy was

360 in the 1-fixation condition 658 in the 3-fixationcondition and 672 in the 5-fixation condition The ef-fect of f ixation condition was significant [F(210) 5419 MSe 5 446 p 001] A planned comparison basedon the error term of the analysis of variance (ANOVA)showed that differences of 78 were significant at the05 level thus all pairwise comparisons were significantexcept for the difference between the 3-fixation condi-tion and the 5-fixation condition In Version 2 of the ex-periment mean accuracy was 545 in the 3-fixationcondition 707 in the 9-fixation condition and 777in the 15-fixation condition The effect of fixation con-dition was significant [F(210) 5 288 MSe 5 294 p 001] A planned comparison based on the error term ofthe ANOVA showed that differences of 63 were sig-nificant at the 05 level thus all pairwise comparisonswere significant A t test comparing the two 3-fixationconditions was not significant [t(10) 5 21 p 05]Mean accuracy at reporting the probed object as a func-tion of the number of fixations on the scene is shown inFigure 4 (the data from the 3-fixation conditions werecombined for this figure)

A number of secondary analyses were done to inves-tigate whether practice or learning had any effect on ac-curacy Because the same objects appeared in differentpositions on every trial it seemed possible that the mem-ory of where an object appeared on trial n 2 1 might in-fluence recall of where that object had appeared on trialn (eg because of proactive interference) There was noevidence that accuracy declined over trials however InVersion 1 of the experiment mean accuracy during thefirst second and last third of the experiment was 527

Table 1Viewing Time and Number of Objects Foveated as a Function

of Fixation Condition

Fixation Condition Viewing Time (msec) Objects Foveated

1 174 0003 739 1255 1360 2689 2810 347

15 4927 510

Figure 3 Mean fixation duration as a function of number of fixations on the scene

EYE MOVEMENTS AND MEMORY 887

551 and 589 In Version 2 of the experiment thecorresponding accuracies were 683 671 and683 There was also no evidence that probing thesame target object (eg the duck) in different positionson successive trials influenced performance accuracywas 626 when the target object on trial n had also beenthe target object (but in a different position) on trial n 21 and accuracy was 621 when different target objectswere probed on trial n and trial n 2 1 (there were no tri-als in which the same target object appeared in the same

probed position on successive trials) In sum there wasno evidence that learning or repetition effects interferedwith memory recall

The analyses that follow were designed to illuminatewhat factors did affect overall accuracy

Accuracy 3 Probe PositionPrevious partial-report experiments have found that

accuracy depends strongly on an itemrsquos position in a dis-play (eg Mewhort Campbell Marchetti amp Campbell

Fixation Condition

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 3 5 9 15

Figure 4 Mean accuracy as a function of the number of fixations on the scene (error barsrepresent the standard error of the mean)

1 fixation3 fixations5 fixations9 fixations15 fixations

Position

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 2 3 4 5 6 7

Figure 5 Mean accuracy as a function of probe position for each fixation condition

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

884 IRWIN AND ZELINSKY

by the spatial probe Because the observers were on a bite-bar andcould not easily speak their method of indicating a selection wassimply to look at the desired object an act that caused a white boxto be drawn around the fixated item and then to press a hand-heldbutton when satisfied with their selection There was no accuracyfeedback and the time commitment per subject was less than 2 h

DesignEach observer participated in 147 trials with these trials evenly

divided into three randomly interleaved fixation conditions (1 3 or5 in Version 1 of the experiment and 3 9 or 15 in Version 2 of theexperiment) Postexperiment questioning revealed that the naiveobservers were unaware of these variable scene presentation timesbeing contingent upon their oculomotor behavior Within each con-dition the spatial probe and therefore the target appeared equallyoften (seven times) at each of the seven display locations Each ob-ject also served as the target seven times resulting in 147 trials (3fixation conditions 3 7 locations 3 7 objects) Except for the aboveconstraints the seven objects in each scene were randomly assignedto the seven locations

Eye Movement RecordingA Fourward Technologies Generation V dual Purkinje-image

eyetracker interfaced with a Tecmar 8-bit 12-channel analog-to-digital converter was used to sample eye position every millisecondthroughout each trial The spatial precision of this tracker duringfixation is 61 min of visual angle To achieve this level of preci-sion a dental impression bite-bar was constructed for each subjectwho was asked to remain on the bite-bar until scheduled breaksafter every 30 trials Prior to the start of the experiment and againbefore every block of trials the observers performed a calibration

procedure consisting of accurately fixating each of nine targets de-marcating the 18ordm 3 116ordm field of view As an internal validitycheck the calibration procedure would not terminate until repeatedfixations on each target resulted in an eye position error of less than03ordm Other than these calibration instructions and a reminder tolook at the fixation cross whenever it appeared on the screen theobservers were not told how to direct their gaze in this experiment Saccadic eye movements were extracted on line using a velocity-based algorithm implementing a roughly 125 degsec detectionthreshold Fixation duration was defined as the period between theoffset of one saccade and the onset of the next Fixations that fellwithin the virtual 24ordm 3 24ordm bounding box within which each ob-ject was presented were scored as being on the object

RESULTS AND DISCUSSION

Discarded DataTrial data were deleted from analysis if the eye was not

within the fixation box when the trial started or if the eye-tracker lost track of the eye during scene blank or probepresentation The proportion of deleted trials did not varysignificantly across conditions in Version 1 of the experi-ment 187 of the trials were deleted from the 1-fixationcondition 170 from the 3-f ixation condition and177 from the 5-fixation condition A procedural changewas implemented in Version 2 of the experiment in orderto reduce the number of trials lost owing to inaccurate fix-ation on the fixation cross that preceded scene presenta-

Figure 1 Grayscale reproduction of one of the scenes used in the experiment

EYE MOVEMENTS AND MEMORY 885

tion In particular to make certain that the subjects wereindeed looking at this cross the study scene would not ap-pear until the gaze deviated from the center of the cross byless than 1ordm at the time of the buttonpress This precautioneliminated the need to discard trials caused by anticipa-

tory saccades to the fixed display locations Thus trialdata were deleted from analysis in Version 2 of the exper-iment only if the eyetracker lost track of the eye duringscene blank or probe presentation The proportion ofdeleted trials in Version 2 did not vary significantly across

Figure 2 Highlights of the experimental procedure

A

B

C

886 IRWIN AND ZELINSKY

conditions 54 of the trials were deleted from the3-fixation condition 109 from the 9-fixation condi-tion and 92 from the 15-fixation condition

Scene Viewing Time and Number of ObjectsFoveated

Because the duration of the display on each trial wasdetermined by fixation condition scene viewing time in-creased as the number of fixations allowed on the sceneincreased (see Table 1) The mean viewing times rangedfrom 174 msec in the 1-fixation condition to 4927 msecin the 15-fixation condition As one might expect thenumber of objects foveated in the scene also increasedwith the number of fixations allowed on the scene In the1-fixation condition the scene was erased as soon as theeyes left the fixation box at the bottom of the scene sono objects were foveated in this condition At the otherextreme an average of 51 objects were foveated in the15-fixation condition (see Table 1) The number of ob-jects foveated in each condition was less than the num-ber of fixations because some fixations fell on no objectand some objects were foveated more than once Meanfixation duration tended to increase as the number of fix-ations on the scene increased (see Figure 3) The first 2fixations on the scene were quite short (approximately150 msec in duration) whereas the duration of subse-quent fixations was captured reasonably well (r2 5 77)by the following equation fixation duration 5 2376 155(fixation number)

Given these differences in viewing time and in the num-ber of objects foveated it would be reasonable to expectthat overall accuracy in reporting the probed object wouldincrease across fixation conditions This is discussed next

Overall AccuracyIn Version 1 of the experiment mean accuracy was

360 in the 1-fixation condition 658 in the 3-fixationcondition and 672 in the 5-fixation condition The ef-fect of f ixation condition was significant [F(210) 5419 MSe 5 446 p 001] A planned comparison basedon the error term of the analysis of variance (ANOVA)showed that differences of 78 were significant at the05 level thus all pairwise comparisons were significantexcept for the difference between the 3-fixation condi-tion and the 5-fixation condition In Version 2 of the ex-periment mean accuracy was 545 in the 3-fixationcondition 707 in the 9-fixation condition and 777in the 15-fixation condition The effect of fixation con-dition was significant [F(210) 5 288 MSe 5 294 p 001] A planned comparison based on the error term ofthe ANOVA showed that differences of 63 were sig-nificant at the 05 level thus all pairwise comparisonswere significant A t test comparing the two 3-fixationconditions was not significant [t(10) 5 21 p 05]Mean accuracy at reporting the probed object as a func-tion of the number of fixations on the scene is shown inFigure 4 (the data from the 3-fixation conditions werecombined for this figure)

A number of secondary analyses were done to inves-tigate whether practice or learning had any effect on ac-curacy Because the same objects appeared in differentpositions on every trial it seemed possible that the mem-ory of where an object appeared on trial n 2 1 might in-fluence recall of where that object had appeared on trialn (eg because of proactive interference) There was noevidence that accuracy declined over trials however InVersion 1 of the experiment mean accuracy during thefirst second and last third of the experiment was 527

Table 1Viewing Time and Number of Objects Foveated as a Function

of Fixation Condition

Fixation Condition Viewing Time (msec) Objects Foveated

1 174 0003 739 1255 1360 2689 2810 347

15 4927 510

Figure 3 Mean fixation duration as a function of number of fixations on the scene

EYE MOVEMENTS AND MEMORY 887

551 and 589 In Version 2 of the experiment thecorresponding accuracies were 683 671 and683 There was also no evidence that probing thesame target object (eg the duck) in different positionson successive trials influenced performance accuracywas 626 when the target object on trial n had also beenthe target object (but in a different position) on trial n 21 and accuracy was 621 when different target objectswere probed on trial n and trial n 2 1 (there were no tri-als in which the same target object appeared in the same

probed position on successive trials) In sum there wasno evidence that learning or repetition effects interferedwith memory recall

The analyses that follow were designed to illuminatewhat factors did affect overall accuracy

Accuracy 3 Probe PositionPrevious partial-report experiments have found that

accuracy depends strongly on an itemrsquos position in a dis-play (eg Mewhort Campbell Marchetti amp Campbell

Fixation Condition

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 3 5 9 15

Figure 4 Mean accuracy as a function of the number of fixations on the scene (error barsrepresent the standard error of the mean)

1 fixation3 fixations5 fixations9 fixations15 fixations

Position

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 2 3 4 5 6 7

Figure 5 Mean accuracy as a function of probe position for each fixation condition

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 885

tion In particular to make certain that the subjects wereindeed looking at this cross the study scene would not ap-pear until the gaze deviated from the center of the cross byless than 1ordm at the time of the buttonpress This precautioneliminated the need to discard trials caused by anticipa-

tory saccades to the fixed display locations Thus trialdata were deleted from analysis in Version 2 of the exper-iment only if the eyetracker lost track of the eye duringscene blank or probe presentation The proportion ofdeleted trials in Version 2 did not vary significantly across

Figure 2 Highlights of the experimental procedure

A

B

C

886 IRWIN AND ZELINSKY

conditions 54 of the trials were deleted from the3-fixation condition 109 from the 9-fixation condi-tion and 92 from the 15-fixation condition

Scene Viewing Time and Number of ObjectsFoveated

Because the duration of the display on each trial wasdetermined by fixation condition scene viewing time in-creased as the number of fixations allowed on the sceneincreased (see Table 1) The mean viewing times rangedfrom 174 msec in the 1-fixation condition to 4927 msecin the 15-fixation condition As one might expect thenumber of objects foveated in the scene also increasedwith the number of fixations allowed on the scene In the1-fixation condition the scene was erased as soon as theeyes left the fixation box at the bottom of the scene sono objects were foveated in this condition At the otherextreme an average of 51 objects were foveated in the15-fixation condition (see Table 1) The number of ob-jects foveated in each condition was less than the num-ber of fixations because some fixations fell on no objectand some objects were foveated more than once Meanfixation duration tended to increase as the number of fix-ations on the scene increased (see Figure 3) The first 2fixations on the scene were quite short (approximately150 msec in duration) whereas the duration of subse-quent fixations was captured reasonably well (r2 5 77)by the following equation fixation duration 5 2376 155(fixation number)

Given these differences in viewing time and in the num-ber of objects foveated it would be reasonable to expectthat overall accuracy in reporting the probed object wouldincrease across fixation conditions This is discussed next

Overall AccuracyIn Version 1 of the experiment mean accuracy was

360 in the 1-fixation condition 658 in the 3-fixationcondition and 672 in the 5-fixation condition The ef-fect of f ixation condition was significant [F(210) 5419 MSe 5 446 p 001] A planned comparison basedon the error term of the analysis of variance (ANOVA)showed that differences of 78 were significant at the05 level thus all pairwise comparisons were significantexcept for the difference between the 3-fixation condi-tion and the 5-fixation condition In Version 2 of the ex-periment mean accuracy was 545 in the 3-fixationcondition 707 in the 9-fixation condition and 777in the 15-fixation condition The effect of fixation con-dition was significant [F(210) 5 288 MSe 5 294 p 001] A planned comparison based on the error term ofthe ANOVA showed that differences of 63 were sig-nificant at the 05 level thus all pairwise comparisonswere significant A t test comparing the two 3-fixationconditions was not significant [t(10) 5 21 p 05]Mean accuracy at reporting the probed object as a func-tion of the number of fixations on the scene is shown inFigure 4 (the data from the 3-fixation conditions werecombined for this figure)

A number of secondary analyses were done to inves-tigate whether practice or learning had any effect on ac-curacy Because the same objects appeared in differentpositions on every trial it seemed possible that the mem-ory of where an object appeared on trial n 2 1 might in-fluence recall of where that object had appeared on trialn (eg because of proactive interference) There was noevidence that accuracy declined over trials however InVersion 1 of the experiment mean accuracy during thefirst second and last third of the experiment was 527

Table 1Viewing Time and Number of Objects Foveated as a Function

of Fixation Condition

Fixation Condition Viewing Time (msec) Objects Foveated

1 174 0003 739 1255 1360 2689 2810 347

15 4927 510

Figure 3 Mean fixation duration as a function of number of fixations on the scene

EYE MOVEMENTS AND MEMORY 887

551 and 589 In Version 2 of the experiment thecorresponding accuracies were 683 671 and683 There was also no evidence that probing thesame target object (eg the duck) in different positionson successive trials influenced performance accuracywas 626 when the target object on trial n had also beenthe target object (but in a different position) on trial n 21 and accuracy was 621 when different target objectswere probed on trial n and trial n 2 1 (there were no tri-als in which the same target object appeared in the same

probed position on successive trials) In sum there wasno evidence that learning or repetition effects interferedwith memory recall

The analyses that follow were designed to illuminatewhat factors did affect overall accuracy

Accuracy 3 Probe PositionPrevious partial-report experiments have found that

accuracy depends strongly on an itemrsquos position in a dis-play (eg Mewhort Campbell Marchetti amp Campbell

Fixation Condition

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 3 5 9 15

Figure 4 Mean accuracy as a function of the number of fixations on the scene (error barsrepresent the standard error of the mean)

1 fixation3 fixations5 fixations9 fixations15 fixations

Position

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 2 3 4 5 6 7

Figure 5 Mean accuracy as a function of probe position for each fixation condition

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

886 IRWIN AND ZELINSKY

conditions 54 of the trials were deleted from the3-fixation condition 109 from the 9-fixation condi-tion and 92 from the 15-fixation condition

Scene Viewing Time and Number of ObjectsFoveated

Because the duration of the display on each trial wasdetermined by fixation condition scene viewing time in-creased as the number of fixations allowed on the sceneincreased (see Table 1) The mean viewing times rangedfrom 174 msec in the 1-fixation condition to 4927 msecin the 15-fixation condition As one might expect thenumber of objects foveated in the scene also increasedwith the number of fixations allowed on the scene In the1-fixation condition the scene was erased as soon as theeyes left the fixation box at the bottom of the scene sono objects were foveated in this condition At the otherextreme an average of 51 objects were foveated in the15-fixation condition (see Table 1) The number of ob-jects foveated in each condition was less than the num-ber of fixations because some fixations fell on no objectand some objects were foveated more than once Meanfixation duration tended to increase as the number of fix-ations on the scene increased (see Figure 3) The first 2fixations on the scene were quite short (approximately150 msec in duration) whereas the duration of subse-quent fixations was captured reasonably well (r2 5 77)by the following equation fixation duration 5 2376 155(fixation number)

Given these differences in viewing time and in the num-ber of objects foveated it would be reasonable to expectthat overall accuracy in reporting the probed object wouldincrease across fixation conditions This is discussed next

Overall AccuracyIn Version 1 of the experiment mean accuracy was

360 in the 1-fixation condition 658 in the 3-fixationcondition and 672 in the 5-fixation condition The ef-fect of f ixation condition was significant [F(210) 5419 MSe 5 446 p 001] A planned comparison basedon the error term of the analysis of variance (ANOVA)showed that differences of 78 were significant at the05 level thus all pairwise comparisons were significantexcept for the difference between the 3-fixation condi-tion and the 5-fixation condition In Version 2 of the ex-periment mean accuracy was 545 in the 3-fixationcondition 707 in the 9-fixation condition and 777in the 15-fixation condition The effect of fixation con-dition was significant [F(210) 5 288 MSe 5 294 p 001] A planned comparison based on the error term ofthe ANOVA showed that differences of 63 were sig-nificant at the 05 level thus all pairwise comparisonswere significant A t test comparing the two 3-fixationconditions was not significant [t(10) 5 21 p 05]Mean accuracy at reporting the probed object as a func-tion of the number of fixations on the scene is shown inFigure 4 (the data from the 3-fixation conditions werecombined for this figure)

A number of secondary analyses were done to inves-tigate whether practice or learning had any effect on ac-curacy Because the same objects appeared in differentpositions on every trial it seemed possible that the mem-ory of where an object appeared on trial n 2 1 might in-fluence recall of where that object had appeared on trialn (eg because of proactive interference) There was noevidence that accuracy declined over trials however InVersion 1 of the experiment mean accuracy during thefirst second and last third of the experiment was 527

Table 1Viewing Time and Number of Objects Foveated as a Function

of Fixation Condition

Fixation Condition Viewing Time (msec) Objects Foveated

1 174 0003 739 1255 1360 2689 2810 347

15 4927 510

Figure 3 Mean fixation duration as a function of number of fixations on the scene

EYE MOVEMENTS AND MEMORY 887

551 and 589 In Version 2 of the experiment thecorresponding accuracies were 683 671 and683 There was also no evidence that probing thesame target object (eg the duck) in different positionson successive trials influenced performance accuracywas 626 when the target object on trial n had also beenthe target object (but in a different position) on trial n 21 and accuracy was 621 when different target objectswere probed on trial n and trial n 2 1 (there were no tri-als in which the same target object appeared in the same

probed position on successive trials) In sum there wasno evidence that learning or repetition effects interferedwith memory recall

The analyses that follow were designed to illuminatewhat factors did affect overall accuracy

Accuracy 3 Probe PositionPrevious partial-report experiments have found that

accuracy depends strongly on an itemrsquos position in a dis-play (eg Mewhort Campbell Marchetti amp Campbell

Fixation Condition

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 3 5 9 15

Figure 4 Mean accuracy as a function of the number of fixations on the scene (error barsrepresent the standard error of the mean)

1 fixation3 fixations5 fixations9 fixations15 fixations

Position

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 2 3 4 5 6 7

Figure 5 Mean accuracy as a function of probe position for each fixation condition

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 887

551 and 589 In Version 2 of the experiment thecorresponding accuracies were 683 671 and683 There was also no evidence that probing thesame target object (eg the duck) in different positionson successive trials influenced performance accuracywas 626 when the target object on trial n had also beenthe target object (but in a different position) on trial n 21 and accuracy was 621 when different target objectswere probed on trial n and trial n 2 1 (there were no tri-als in which the same target object appeared in the same

probed position on successive trials) In sum there wasno evidence that learning or repetition effects interferedwith memory recall

The analyses that follow were designed to illuminatewhat factors did affect overall accuracy

Accuracy 3 Probe PositionPrevious partial-report experiments have found that

accuracy depends strongly on an itemrsquos position in a dis-play (eg Mewhort Campbell Marchetti amp Campbell

Fixation Condition

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 3 5 9 15

Figure 4 Mean accuracy as a function of the number of fixations on the scene (error barsrepresent the standard error of the mean)

1 fixation3 fixations5 fixations9 fixations15 fixations

Position

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

01 2 3 4 5 6 7

Figure 5 Mean accuracy as a function of probe position for each fixation condition

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

888 IRWIN AND ZELINSKY

1981) Figure 5 shows accuracy as a function of whichposition in the scene was probed for each fixation con-dition Position 1 refers to the leftmost object in thescene Position 7 to the rightmost and Position 4 to theobject directly above the fixation point Accuracy in theone-fixation condition showed a U-shaped function withposition so that objects at the sides of the display wereremembered better than objects at other array positionsThere are probably several factors contributing to this re-sult There is less lateral masking for the two end itemsprevious researchers have found that both retinal acuityand visual attention gradients are not circular but ratherare elongated in the horizontal dimension (eg Pan ampEriksen 1993) and subjects may have shifted their at-tention preferentially to the end items as part of a high-level scanning strategy

Of more interest is what happened to the accuracy 3probe position function when more fixations were madeon the display Recall for example that overall accuracyincreased as the number of fixations increased from 1 to3 fixations on the scene Figure 5 shows that not all arraypositions benefited equally however Rather accuracy forthe center-left items in the display increased more than ac-curacy for the center-right items in the display Increasingthe number of fixations from 3 to 5 had similar effects Incontrast accuracy for the rightmost items in the sceneincreased whereas accuracy for the leftmost items in thescene decreased when the number of fixations on thescene increased from 5 to 9 and from 9 to 15 In generalFigure 5 shows that peak accuracy for positions in thearray occurs early for positions on the left side of thearray and late for positions on the right side of the array

It seemed possible that the changes in accuracy acrossprobe position with increasing number of fixations onthe scene might be due to the subjectsrsquo viewing strategies

Figure 6 shows the percentage of fixations on different ob-ject positions in the scene as a function of fixation num-ber (eg 33 of all second fixations fell on Position 13 of all second fixations fell on Position 2 and so on)1In general the pattern of fixations looks very similar tothe pattern of accuracies shown in Figure 5 The subjectstended to fixate the leftmost items in the array early inscene viewing but shifted to rightward positions later inscene viewing There was a great deal of variability be-tween and even within subjects in terms of their viewingstrategies however Two subjects (1 in Version 1 of theexperiment and 1 in Version 2) fixated the objects fromleft to right on most trials 1 subject fixated the objectsfrom right to left on most trials 1 subject moved from leftto right on half the trials and right to left on the other half1 usually fixated Position 1 then Position 4 then Posi-tion 7 skipping over the other positions 1 usually fixatedPosition 4 first and then moved left 1 usually fixated Po-sition 4 first and then moved right and the remainderused a mixture of these strategies changing from one toanother during the course of the experiment There waslittle difference in accuracy between the 3 subjects whoconsistently used a left-to-right or a right-to-left strategy(591) and the 9 who did not (628)

The similarity between the pattern of accuracies in Fig-ure 5 and the pattern of eye fixations in Figure 6 suggeststhat accuracy was affected by whether (and perhaps when)the probed item was foveated during scene presentationThis was examined directly in the next two analyses

Accuracy for Foveated Versus NonfoveatedObjects

On some trials the subjects foveated the object thathappened to be probed whereas on other trials they didnot Figure 7 shows accuracy for foveated versus non-

Position

Per

cent

age

of F

ixat

ions

40

35

30

25

20

15

10

5

01 2 3 4 5 6 7

Fix 2

Fix 3

Fix 4

Fix 5

Fix 6

Fix 7

Fix 8

Fix 9

Fix 10

Fix 11

Fix 12

Fix 13

Fix 14

Fix 15

Figure 6 The percentage of fixations on each object position as a function of fixation number

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 889

foveated targets as a function of the number of fixationson the scene No targets were foveated in the 1-fixationcondition because the scene disappeared during the firstsaccade In the 3- 5- and 9-fixation conditions how-ever accuracy was much higher if the position that wasprobed had been foveated as compared with when it hadnot been foveated during scene presentation [929 vs530 in the 3-fixation case t(11) 5 76 p 001842 vs 565 in the 5-fixation case t(5) 5 26 p 05 858 vs 559 in the 9-fixation case t(5) 5 53p 005] The benefit of foveation (808 vs 744)was not significant when 15 fixations had been made onthe scene however [t(5) 5 10 p 35] An inspectionof Figure 7 suggests that this occurred for two reasonsFirst accuracy for nonfoveated objects increased sub-

stantially (from 530 to 744) as the number of fixa-tions on the scene increased from 3 to 15 presumablybecause with an increasing number of f ixations theeyes were landing near the nonfoveated objects even ifthey were not landing on them second accuracy forfoveated objects decreased as the number of fixations in-creased This suggests that recency of fixation on the tar-get object might also be important (ie the 15-fixationcondition includes fixations on the target further back intime than does the 3-fixation condition) This was ex-amined in the next analysis

Accuracy as a Function of Fixation RecencyFigure 8 shows accuracy for foveated targets only as a

function of when during the trial the object had been

Number of Fixations

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

0

1 3 5 9 15

FoveatedNonfoveated

Figure 7 Mean accuracy for foveated versus nonfoveated targets as a function of thenumber of fixations on the scene (an error bar represents the standard error of themean)

Recency of Fixation

Per

cent

age

Cor

rect

100

90

80

70

60

50

40

30

20

10

00 ndash1 ndash2 ndash3 ndash 4 ndash 5

Figure 8 Mean accuracy for foveated targets as a function only of when during the trialthe target had been foveated (an error bar represents the standard error of the mean)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

890 IRWIN AND ZELINSKY

foveated If the object that had been foveated just beforethe critical saccade (ie the saccade that terminated thescene) happened to be probed for report (denoted as fix-ation recency 0 in the figure) the subjects reported itcorrectly on over 90 of the trials If the object probedfor report had been foveated 1 or 2 fixations earlier (cor-responding to fixation recencies 21 and 22 in the fig-ure) accuracy was slightly lower (approximately 80)Probing an object that had been foveated 3 or more fix-ations before the critical saccade resulted in accuraciesthat were considerably lower (approximately 65) andnot much different from the accuracy obtained when anonfoveated object had been probed for report (59)The effect of fixation recency (using six levels of re-cency) was significant [F(525) 5 285 MSe 5 2432p 05] in a one-way repeated measures ANOVA thatwas conducted on the data from Version 2 of the experi-ment (this version allowed 3 9 or 15 fixations on thedisplay and thus provided a wide range of fixation re-cencies) A planned comparison based on the error termof the ANOVA showed that differences of 145 weresignificant at the 05 level thus accuracy for the threeobjects foveated just before the critical saccade (iewith fixation recencies of 0 21 and 22) was signifi-cantly higher than accuracy for objects foveated 3 ormore fixations back These results suggest that fixationrecency does influence accuracy in particular it appearsthat people remember the last three objects foveated inthe scene much better than they remember objectsfoveated earlier in time (see Zelinsky amp Loschky 1998for similar results) There was no evidence of a primacyeffect if the object probed for report was the first objectfoveated (and had been foveated 3 or more fixations be-fore the critical saccade) accuracy was 66

Accuracy as a Function of the Number ofObjects Foveated

Given the importance of foveation to scene memorythat is revealed by the preceding analyses one might ex-pect that overall accuracy would depend on the numberof unique objects that were foveated in the scene (recallthat 7 objects appeared in the scene) This relationship isshown in Figure 9 Accuracy increased as 0ndash6 objectswere foveated but there was no additional increase in ac-curacy from 6 to 7 objects Accuracy peaked at 80 evenwhen 6 or 7 unique objects were foveated An accuracylevel of 80 corresponds to memory for 54 objects inthe scene2 This result suggests that scene representa-tions contain no more than about f ive object positionpairings regardless of the number of f ixations on thescene This is not to say that 5 discrete objects in all theirfull detail are stored on every trial however rather thisestimate represents the average amount of informationstored from what is most likely a stochastic processThus it is probably most accurate to say that approxi-mately 5 objectsrsquo ldquoworthrdquo of information is stored in theon-line scene representation

Accuracy When Final Saccade Was Directed atthe Target Versus Directed Elsewhere

Previous research has shown that movements of atten-tion precede movements of the eyes to locations in spacethereby facilitating the processing of information that ispresent at the saccade target location (eg Deubel ampSchneider 1996 Henderson 1993 Henderson Pollat-sek amp Rayner 1989 Hoffman amp Subramaniam 1995Irwin amp Gordon 1998 Klein 1980 Klein amp Pontefract1994 Kowler Anderson Dosher amp Blaser 1995Rayner McConkie amp Ehrlich 1978 Shepherd Findlay

0 1 2 3 4 5 6 7

100

90

80

70

60

50

40

30

20

10

0

Number of Objects Foveated

Per

cent

age

Cor

rect

Figure 9 Mean accuracy as a function of the number of unique objects that were foveatedduring scene presentation (an error bar represents the standard error of the mean)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 891

amp Hockey 1986) Thus we expected that accuracywould be higher when the critical saccade was directedat the object that happened to be probed on any giventrial as compared with when the critical saccade was di-rected toward an unprobed object To elaborate considera one-fixation trial in which the baby bottle was pre-sented in the leftmost position of the display Supposethat the subject moved hisher eyes from the central fix-ation box toward this position Because this was a one-fixation trial the scene disappeared as soon as the sac-cade was initiated to the bottle so nothing was presenton the screen when the eyes landed A short time laterthe probe was presented at one of the display positionsand on some trials it just happened to be presented at thelocation to which the eyes were sent on the critical sac-cade the question of interest is whether accuracy washigher when the probed item happened to be the itemthat the eyes were sent to on the critical saccade thanwhen it was not

Figure 10 shows accuracy for trials in which theprobed item was or was not the target of the critical sac-cade in each fixation condition (by target of the saccadewe mean that the eyes landed within the virtual 24ordmsquare bounding box that defined object position) Onlytrials in which the probed item was not fixated at anytime during scene presentation are included In everyfixation condition accuracy was much higher when theprobed item was the target of the final saccade than whenit was not [778 vs 309 in the 1-f ixation caset(5) 5 26 p 05 929 vs 515 in the 3-fixationcase t(11) 5 60 p 001 879 vs 513 in the 5-fixation case t(5) 5 36 p 02 998 vs 658 in the9-fixation case t(5) 5 102 p 001 and 983 vs751 in the 15-fixation case t(5) 5 49 p 005]

In sum accuracy at reporting the probed item wasvery high if the probe happened to be presented at the lo-cation to which the eyes had been sent on the critical sac-cade even though the object at that location had neverbeen foveated during the trial merely being the target ofthe final saccade increased its memorability consider-ably (see Henderson amp Hollingworth 1999b for similarfindings) This presumably occurred because attentionpreceded the eyes to the saccade target location The ef-fect of attention is to increase the likelihood that infor-mation at the saccade target location will be encodedinto onersquos mental representation of the scene In fact itis possible that many of the beneficial effects of foveat-ing an object discussed above are not due to foveationper se but rather to the attention shift that preceded it

DISCUSSION

In the last 20 years there has been an explosion of in-terest in the properties of on-line scene representation(eg Aginsky amp Tarr 2000 Blackmore et al 1995De Graef 1992 Grimes 1996 Hayhoe 2000 Hender-son amp Hollingworth 1999a 1999b Hollingworth ampHenderson 2000 2002 Hollingworth et al 2001 In-traub 1997 Irwin 1991 1992 Irwin amp Andrews 1996Irwin et al 1988 Irwin et al 1983 Irwin Zacks ampBrown 1990 Levin amp Simons 1997 McConkie amp Cur-rie 1996 Mondy amp Coltheart 2000 OrsquoRegan 1992OrsquoRegan amp Levy-Schoen 1983 Pringle Irwin Krameramp Atchley 2001 Rayner amp Pollatsek 1983 Rensink2000 Rensink et al 1997 Simons 1996 2000 Simonsamp Levin 1998 Wallis amp Bulthoff 2000) One commonfinding is that scene representations appear to be sparseand undetailed Consistent with previous research the

1 3 5 9

100

90

80

70

60

50

40

30

20

10

0

Number of Fixations

Per

cent

age

Cor

rect

15

DirectedNot Directed

Figure 10 Mean accuracy when the final saccade was directed at the target versusnot directed at the target as a function of the number of fixations on the scene (anerror bar represents the standard error of the mean)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

892 IRWIN AND ZELINSKY

results of our experiment also indicate that scene repre-sentations are relatively sparse even after 15 fixationson a scene our subjects remembered the positionidentitypairings of only about 78 of the objects in our seven-object displays or the equivalent of about five objectsrsquoworth of information But more important our resultsprovide new information about what factors influencethe contents of the on-line scene representation and howthis representation changes during the course of sceneviewing In particular our analyses suggest that the lastthree items that are foveated and the item about to befoveated are actually remembered quite well much bet-ter than other items in a scene In other words on-linescene representations are dynamic so that the memorytraces of recently fixated objects are much stronger thanthose of objects viewed several fixations back in timeInformation about a scene does appear to accumulateover multiple fixations but the capacity of the on-linescene representation appears to be limited to about fiveitems In most respects the results of the present studyare consistent with the results of other studies that haveused the transsaccadic partial-report technique to inves-tigate integration across saccades with nonscene dis-plays (eg Irwin 1992 Irwin amp Andrews 1996 Irwinamp Gordon 1998)

What is the nature of the memory representation under-lying performance in our task Given that the same sevenobjects appeared in fixed positions on each trial onemight imagine that subjects would simply adopt a left-to-right reading strategy and would store an ordered list ofobject names in verbal working memory Then when theprobe was presented they would scan their memorizedlist and report the appropriate item Three aspects of ourdata seem inconsistent with this hypothesis howeverFirst as was noted earlier for the most part the subjectsdid not use a left-to-right reading strategy nor did theyeven adopt any consistent strategy across trials Thisvariability would greatly complicate memory retrieval ifthe subjects had stored an ordered list of object names inverbal working memory because there would be no con-sistent mapping between probe position and the positionof an item in verbal working memory That is if thememory probe appeared at Position 6 say this wouldcorrespond to the sixth position in the list if the subjectscanned the array from left to right the second positionin the list if the subject scanned from right to left thethird position if the subject started in the middle and thenmoved right and so on The fact that the subjects did notadopt a fixed scanning order suggests to us that theywere not storing an ordered list of object names in ver-bal working memory Second the fact that the subjectsremembered only a maximum of five objects from thedisplay also seems inconsistent with the verbal workingmemory hypothesis because the capacity of verbalworking memory is generally assumed to be betweenfive and nine items given 5 sec to study the display wewould think that the subjects would be able to remember

more than five items if they were using verbal workingmemory Finally the absence of a primacy effect in ourdata also seems inconsistent with the verbal workingmemory hypothesis

Instead we believe that our results are more consistentwith the object-f ile theory of transsaccadic memory(Irwin 1992 1996 Irwin amp Andrews 1996) This theoryis based on the theoretical framework for object percep-tion proposed by Kahneman and Treisman (1984) andKahneman Treisman and Gibbs (1992) It contains fourlevels of representation feature maps which register in-dependently the presence of different sensory features ina scene such as color and shape a master map of loca-tions which registers where in a scene features are lo-cated temporary object representations called objectfiles which are formed when features are conjoined intounitary wholes via attention and which thus representwhat objects are located where in a scene and an abstractlong-term recognition network that stores descriptions ofobjects along with their names Object files may containvisual verbal and semantic information about objectsbecause they have access to long-term memory as well asto perceptual input According to the object-file theoryof transsaccadic memory relatively little information ispreserved across eye movements Rather transsaccadicmemory (and by extension the on-line scene representa-tion) consists of 3mdash5 object files that contain positionand identity information for items that have been at-tended in the scene and of residual activation in the fea-ture maps and in the long-term recognition network Theresults of the present study are consistent with the object-file theory because even after 15 fixations on a scenememory for the scene was limited and consisted largelyof the last few objects that had been attended in the scene

Although the results of the present study are consis-tent with the object-file theory they seem not entirelyconsistent with another theory of scene representationproposed recently by Rensink (2000) According toRensinkrsquos coherence theory in the absence of focusedattention volatile low-level proto-objects are continuallybeing formed and replaced rapidly and in parallel acrossthe visual field reflecting whatever is present on theretina at any moment in time Focused attention providesa coherence field that selects a small number of theseproto-objects to form a stable individuated object thathas spatiotemporal continuity This stable object doesnot necessarily correspond to a single element in the vi-sual field however but may represent a supra-object (cfPylyshynrsquos FINSTs Pylyshyn amp Storm 1988) After fo-cused attention is released the object loses its coherenceand dissolves back into its constituent proto-object partsOnce that occurs there is little or no consequence of hav-ing been attended According to coherence theory visualshort-term memory corresponds to the coherence fieldit contains object tokens (ie specific instantiations at aparticular place and time) that are currently in the focusof attention but that disappear when attention is with-

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 893

drawn The coherence theory does allow for the exis-tence of a nonvisual short-term memory for previouslyattended items but it posits that this memory containsinformation only about object types (general definitionsabstracted away from specific spatiotemporal contexts)

The results of the present study seem to raise problemsfor the coherence theory because object tokens were re-tained in memory for recently fixated objects that wereno longer in the focus of attention Consistent with co-herence theory we found that accuracy was very highwhen the object at the final saccade target location(hence the object at the focus of attention) was probedfor report However we also found that accuracy wasquite high for the three objects that had been fixated justprior to this Since attention movements precede eyemovements in an obligatory fashion the three objectsviewed in prior fixations could not also be at the focus ofattention The fact that accuracy was not uniform for thefour objects in question also indicates that they were notall simultaneously at the focus of attention Rather theobjects viewed during the fixations just preceding thefinal saccade benefited from prior attentional process-ing This is inconsistent with the claim that once focusedattention is released objects dissolve back into their con-stituent proto-objects with no consequence of havingbeen attended Note also that object tokens and not ob-ject types were being remembered in our experiment inorder to respond correctly to the partial-report probe thesubjects had to remember the position and the identityof the probed itemmdashthat is they had to remember an ob-ject token Merely remembering that an object type hadappeared somewhere in the display would not lead to anaccurate response especially since the same seven itemsappeared on every trial

Although the present study provides important infor-mation about the factors that influence the contents of on-line scene representations two limitations should benoted One is that our study measured only short-termmemory contributions to the on-line scene representationand not potential long-term memory contributions Thesame seven objects appeared in the same scene context onevery trial so top-down contributions to performancewere essentially eliminated In the real world scene rep-resentations are bound to be influenced by knowledgeabout what kinds of objects are typically found in a par-ticular scene context and where those objects are usuallylocated Indeed recent research by Hollingworth andHenderson (2002 see also Hollingworth et al 2001) inwhich a novel saccade-contingent change detection pro-cedure was used has demonstrated that long-term mem-ory plays a crucial role in the construction and mainte-nance of on-line scene representations

A second limitation of our study is that it involvedonly a test of explicit memory Recently several investi-gators have found evidence that explicit tests of scenememory may underestimate the amount of informationin the representation (eg Fernandez-Duque amp Thornton2000 Hollingworth et al 2001 Williams amp Simons

2000) For example Hollingworth et al found that chang-ing an object in a scene often affected fixation durationon that object even though subjects failed to report thata change had occurred This suggests that at some levelthe perceptual system detected the change even thoughthe subjects were not explicitly aware of it It is impor-tant to note that the implicit effects that have been ob-served have often been small however Furthermoreeven though implicit measures of performance (eg eyemovements) appear to be more sensitive to change thanare explicit measures it does not follow from this thatscene representations contain a great deal more infor-mation than has been measured by explicit tests Explicittests may underestimate the amount of information in thescene representation but there is no evidence that theyunderestimate it by much This is still an open question

REFERENCES

Aginsky V amp Tarr M (2000) How are different properties of a sceneencoded in visual memory Visual Cognition 7 147-162

Averbach E amp Coriell E (1961) Short-term memory in visionBell System Technical Journal 40 309-328

Blackmore S J Brelstaff G Nelson K amp Troscianko T(1995) Is the richness of our visual world an illusion Transsaccadicmemory for complex scenes Perception 24 1075-1081

Bridgeman B Hendry D amp Stark L (1975) Failure to detect dis-placement of the visual world during saccadic eye movements Vi-sion Research 15 719-722

Bridgeman B amp Mayer M (1983) Failure to integrate visual infor-mation from successive fixations Bulletin of the Psychonomic Soci-ety 21 285-286

Bridgeman B van der Heijden A amp Velichkovsky B (1994) Atheory of visual stability across saccadic eye movements Behavioralamp Brain Sciences 17 247-292

Busey T amp Loftus G (1994) Sensory and cognitive components ofvisual information acquisition Psychological Review 101 446-469

Currie C B McConkie G W Carlson-Radvansky L A ampIrwin D E (2000) The role of the saccade target object in the per-ception of a visually stable world Perception amp Psychophysics 62673-683

De Graef P (1992) Scene-context effects and models of real-worldperception In K Rayner (Ed) Eye movements and visual cognitionScene perception and reading (pp 243-259) New York Springer-Verlag

Deubel H amp Schneider W X (1996) Saccade target selection andobject recognition Evidence for a common attentional mechanismVision Research 36 1827-1837

Fernandez-Duque D amp Thornton I M (2000) Change detectionwithout awareness Do explicit reports underestimate the representa-tion of change in the visual system Visual Cognition 7 324-344

Grimes J (1996) On the failure to detect changes in scenes across sac-cades In K Akins (Ed) Perception (Vancouver Studies in CognitiveScience Vol 5 pp 89-110) New York Oxford University Press

Hayhoe M M (2000) Vision using routines A functional account ofvision Visual Cognition 7 43-64

Henderson J M (1993) Visual attention and saccadic eye move-ments In G drsquoYdewalle amp J Van Rensbergen (Eds) Perception andcognition Advances in eye-movement research (pp 37-50) Amster-dam North-Holland

Henderson J M (1997) Transsaccadic memory and integration dur-ing real-world object perception Psychological Science 8 51-55

Henderson J M amp Hollingworth A (1999a) High-level sceneperception Annual Review of Psychology 50 243-271

Henderson J M amp Hollingworth A (1999b) The role of fixationposition in detecting scene changes across saccades PsychologicalScience 10 438-443

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

894 IRWIN AND ZELINSKY

Henderson J M Pollatsek A amp Rayner K (1989) Covert visualattention and extrafoveal information use during object identifica-tion Perception amp Psychophysics 45 196-208

Hochberg J (1986) Representation of motion and space in video andcinematic displays In K R Boff L Kaufman amp J P Thomas (Eds)Handbook of perception and human performance Vol 1 Sensoryprocesses and perception (pp 221-2264) New York Wiley

Hoffman J E amp Subramaniam B (1995) The role of visual atten-tion in saccadic eye movements Perception amp Psychophysics 57787-795

Hollingworth A amp Henderson J M (2000) Semantic informa-tiveness mediates the detection of changes in natural scenes VisualCognition 7 213-235

Hollingworth A amp Henderson J M (2002) Accurate visualmemory for previously attended objects in natural scenes Journal ofExperimental Psychology Human Perception amp Performance 28113-136

Hollingworth A Williams C C amp Henderson J M (2001) Tosee and remember Visually specific information is retained in mem-ory from previously attended objects in natural scenes PsychonomicBulletin amp Review 8 761-768

Intraub H (1997) The representation of visual scenes Trends in Cog-nitive Sciences 1 217-222

Irwin D E (1991) Information integration across saccadic eye move-ments Cognitive Psychology 23 420-456

Irwin D E (1992) Memory for position and identity across eye move-ments Journal of Experimental Psychology Learning Memory ampCognition 18 307-317

Irwin D E (1996) Integrating information across saccadic eye move-ments Current Directions in Psychological Science 5 94-100

Irwin D E amp Andrews R V (1996) Integration and accumulationof information across saccadic eye movements In T Inui amp J L Mc-Clelland (Eds) Attention and performance XVI Information inte-gration in perception and communication (pp 125-155) CambridgeMA MIT Press

Irwin D E Brown J S amp Sun J-S (1988) Visual masking and vi-sual integration across saccadic eye movements Journal of Experi-mental Psychology General 117 276-287

Irwin D E amp Gordon R D (1998) Eye movements attention andtranssaccadic memory Visual Cognition 5 127-155

Irwin D E Yantis S amp Jonides J (1983) Evidence against visualintegration across saccadic eye movements Perception amp Psycho-physics 34 49-57

Irwin D E Zacks J L amp Brown J S (1990) Visual memory andthe perception of a stable visual environment Perception amp Psycho-physics 47 35-46

Kahneman D amp Treisman A (1984) Changing views of attentionand automaticity In R Parasuraman amp D R Davies (Eds) Varietiesof attention (pp 29-61) Orlando FL Academic Press

Kahneman D Treisman A amp Gibbs B J (1992) The reviewing ofobject f iles Object-specific integration of information CognitivePsychology 24 175-219

Klein R (1980) Does oculomotor readiness mediate cognitive controlof visual attention In R S Nickerson (Ed) Attention and perfor-mance VIII (pp 259-276) Hillsdale NJ Erlbaum

Klein R amp Pontefract A (1994) Does oculomotor readiness me-diate cognitive control of visual attention Revisited In C Umiltagrave ampM Moskovitch (Eds) Attention and performance XV Consciousand nonconscious information processing (pp 333-350) CambridgeMA MIT Press Bradford Books

Kowler E Anderson E Dosher B amp Blaser E (1995) The roleof attention in the programming of saccades Vision Research 351897-1916

Levin D T amp Simons D J (1997) Failure to detect changes to at-tended objects in motion pictures Psychonomic Bulletin amp Review4 501-506

Matin E (1974) Saccadic suppression A review and an analysis Psy-chological Bulletin 81 899-917

McConkie G W amp Currie C B (1996) Visual stability across sac-cades while viewing complex pictures Journal of Experimental Psy-chology Human Perception amp Performance 22 563-581

McConkie G W amp Zola D (1979) Is visual information integratedacross successive fixations in reading Perception amp Psychophysics25 221-224

Mewhort D J K Campbell A J Marchetti F M amp CampbellJ I D (1981) Identif ication localization and ldquoiconic memoryrdquo Anevaluation of the bar-probe task Memory amp Cognition 9 50-67

Mondy S amp Coltheart V (2000) Detection and identification ofchange in naturalistic scenes Visual Cognition 7 281-296

OrsquoRegan J K (1992) Solving the ldquorealrdquo mysteries of visual percep-tion The world as an outside memory Canadian Journal of Psy-chology 46 461-488

OrsquoRegan J K amp Levy-Schoen A (1983) Integrating visual infor-mation from successive fixations Does trans-saccadic fusion existVision Research 23 765-768

Pan K amp Eriksen C W (1993) Attentional distribution in the visualfield during samendashdifferent judgments as assessed by response com-petition Perception amp Psychophysics 53 134-144

Pashler H (1988) Familiarity and visual change detection Percep-tion amp Psychophysics 44 369-378

Phillips W A (1974) On the distinction between sensory storage andshort-term visual memory Perception amp Psychophysics 16 283-290

Pollatsek A Rayner K amp Henderson J (1990) The role of spa-tial location in the integration of pictorial information across sac-cades Journal of Experimental Psychology Human Perception ampPerformance 16 199-210

Pringle H L Irwin D E Kramer A F amp Atchley P (2001)The role of attentional breadth in perceptual change detection Psy-chonomic Bulletin amp Review 8 89-95

Pylyshyn Z W amp Storm R W (1988) Tracking multiple indepen-dent targets Evidence for a parallel tracking mechanism Spatial Vi-sion 3 179-197

Rayner K McConkie G W amp Ehrlich S (1978) Eye movement sand integrating information across f ixations Journal of Experimen-tal Psychology Human Perception amp Performance 4 529-544

Rayner K McConkie G W amp Zola D (1980) Integrating infor-mation across eye movements Cognitive Psychology 12 206-226

Rayner K amp Pollatsek A (1983) Is visual information integratedacross saccades Perception amp Psychophysics 34 39-48

Rensink R A (2000) The dynamic representation of scenes VisualCognition 7 17-42

Rensink R A OrsquoRegan J K amp Clark J J (1997) To see or not tosee The need for attention to perceive changes in scenes Psycho-logical Science 8 368-373

Shepherd M Findlay J amp Hockey R (1986) The relationship be-tween eye movements and spatial attention Quarterly Journal of Ex-perimental Psychology 38A 475-491

Simons D J (1996) In sight out of mind When object representa-tions fail Psychological Science 7 301-305

Simons D J (2000) Current approaches to change blindness VisualCognition 7 1-15

Simons D J amp Levin D T (1997) Change blindness Trends in Cog-nitive Sciences 1 261-267

Simons D J amp Levin D T (1998) Failure to detect changes to peo-ple during a real-world interaction Psychonomic Bulletin amp Review5 644-649

Sperling G (1960) The information available in brief visual presen-tations Psychological Monographs 74(11 Whole No 498)

Van Dijk T amp Kintsch W (1983) Strategies of discourse compre-hension New York Academic Press

Wallis G amp Bulthoff H (2000) Whatrsquos scene and not seen Influ-ences of movement and task upon what we see Visual Cognition 7175-190

Williams P amp Simons D J (2000) Detecting changes in novel 3Dobjects Effects of change magnitude spatiotemporal continuity andstimulus familiarity Visual Cognition 7 297-322

Zelinsky G J (1999) Precuing target location in a variable set sizeldquononsearchrdquo task Dissociating search-based and interference-basedexplanations for set size effects Journal of Experimental Psychol-ogy Human Perception amp Performance 25 875-903

Zelinsky G J (2001) Eye movements during change detection Im-

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)

EYE MOVEMENTS AND MEMORY 895

plications for search constraints memory limitations and scanningstrategies Perception amp Psychophysics 63 209-225

Zelinsky G J amp Loschky L (1998) Toward a realistic assessmentof visual working memory Investigative Ophthalmology amp VisualScience 39 S224

Zelinsky G J Rao R Hayhoe M amp Ballard D (1997) Eyemovements reveal the spatiotemporal dynamics of visual search Psy-chological Science 8 448-453

NOTES

1 The percentage of f ixations summed across positions in Figure 6does not add up to 100 because some fixations did not fall on any ob-ject position This happened relatively rarely for Fixations 3ndash15 rang-ing from 3 to 9 of all the fixations It happened quite often for Fix-ation 2 (the first real f ixation on the scene since Fixation 1 was on thefixation point) accounting for 45 of all the fixations Approximately

67 of these fell near an object position whereas the remainder werescattered randomly in the central area of the scene these may have beenldquocenter of massrdquo fixations like those previously observed by ZelinskyRao Hayhoe and Ballard (1997)

2 In order to determine how many objects would have to be stored inmemory to produce an accuracy level of 80 we first to correct forguessing (Busey amp Loftus 1994) applied the formula p 5 (x 2 g)(1 2g) where x is the raw proportion correct g is the guessing probabilityand p is the corrected proportion correct (note that g 5 143 or 17 be-cause there were always seven objects in the array) We then multipliedp by the number of objects in the display in order to estimate the num-ber of objects remembered (Sperling 1960)

(Manuscript received March 9 2001revision accepted for publication November 28 2001)