CHAPTER 3. VISUAL ATTENTION AND VISUAL AWARENESS.


Section I. Anatomy and physiology of the human visual system.

Rufin VanRullen (1) and Christof Koch (2)

1. CNRS, Centre de Recherche Cerveau et Cognition, 133 Route de Narbonne, 31062 Toulouse Cedex, France.
2. California Institute of Technology, Division of Biology and Division of Engineering and Applied Sciences, MC 139-74, Pasadena, CA 91125, USA.


Intuitively, vision appears to be an easy process: effortless, almost automatic, and so efficient that a simple glance at a complex scene seems sufficient to produce immediate awareness of its entire structure and all of its elements. Unfortunately, much of this is a grand illusion (O'Regan & Noe, 2001). Appropriate manipulations reveal that many, often essential, aspects of the visual scene can go entirely unnoticed. For example, human subjects will often fail to notice an unexpected but quite large stimulus flashed right at the center of gaze during a psychophysical experiment (a phenomenon known as "inattentional blindness"; Mack and Rock, 1998). In more natural environments, observers can fail to notice the appearance or disappearance of a large object (Fig 3.1; Rensink et al, 1997; O'Regan et al, 1999), the change in identity of the person they are conversing with (Simons and Levin, 1998), or the passage of a gorilla in the middle of a ball game (Simons and Chabris, 1999). As a group, these visual failures are referred to as "change blindness" (Rensink, 2002). Such limitations are rarely experienced directly in real life (except as one of the main instruments of magicians' tricks); yet they shape much of our visual perception.

Fig 3.1. An example of change blindness. The two pictures, which differ in some respect unknown in advance (a change of size, color, or position, or even the disappearance of an object, as shown here), are presented successively, always separated by a blank frame, and the sequence is repeated in a cycle. Typically, an observer requires many repetitions in order to notice the change, even when this change is substantial.

These peculiar phenomena in fact reflect the limited capacity of an ingredient essential to much of visual perception: attention. While the retina potentially embraces the entire scene, attention can focus on only one or a few elements at a time, and thus facilitate their perception, their recognition, or their memorization for later recall. This is not to say that perception cannot exist outside the focus of attention, as will be seen in the last section. Motion-induced blindness (Bonneh et al, 2001), flash suppression (Wolfe, 1984) and binocular rivalry (Blake and Logothetis, 2002) are other examples of visual phenomena in which the withdrawal of focal attention is likely to be critical.

Before addressing the nature and the role of attention, it is equally important to understand what can be done in the absence of attention. This depends, in part, on the overall structure and organization of visual cortex.

Fig 3.2. Visual cortical hierarchy. At least two functional streams can be identified emerging from primary visual cortex (V1). The ventral "what" pathway runs through V4 into infero-temporal cortex (right), while the dorsal "where" pathway comprises areas V3, MT and MST, ending within parietal cortex (left). Adapted from Felleman and Van Essen (1991).


3.1. CORTICAL HIERARCHIES AND PROCESSING STREAMS

3.1.1. Hierarchical organization.

As detailed in the previous chapter, the three dozen cortical areas that constitute visual cortex are not randomly interconnected but display a specific pattern of organization. The laminar distribution of cortical projection neurons and of axonal termination zones permits the observant neuro-anatomist to define forward, feedback and sideways cortico-cortical connections (Rockland & Pandya, 1979; Bullier et al, 1984). In visual cortex, each area can thus be assigned a position within a hierarchy comprising at least a dozen levels (Felleman and Van Essen, 1991; Van Essen et al, 1992). Functionally, in the ventral pathway, the hierarchy (which is non-unique) is best described as a sequence of feature-selective neuronal populations of increasing complexity (Barlow, 1972): each level "explicitly" represents a particular feature dimension (e.g. color, orientation), with high-level concepts and categories being "explicitly" represented in higher-level areas (e.g. inferior and medial temporal cortex). By "explicit", we mean that the firing of a certain population of neurons can be directly related to the presence of this aspect or element within the visual scene. For example, direction of movement can be explicitly represented by certain neurons or cortical columns in area MT (Newsome et al, 1989). Patients with lesions in and around MT can show a selective loss of movement perception (Zihl et al, 1983). One can thus say that MT constitutes an "essential node" (Zeki, 2001) for direction of movement. There is probably a direct relation between the clinical concept of "essential node" and the neurophysiological concept of "explicit coding" (e.g. columnar representation), and we believe that both concepts will prove very useful for describing neuronal coding and representation. To better understand the primate cortical visual system, whose organization is rather complex (Fig 3.2.), it is convenient to separate it into two distinct functional streams.

3.1.2. What and Where pathways

It was primarily on the basis of lesion studies in macaque monkeys that Leslie Ungerleider and Mortimer Mishkin (Ungerleider & Mishkin, 1982) arrived at the conclusion that the visual system comprises two ensembles of cortical areas with complementary functions (Morel & Bullier, 1990). These experiments in macaques were informed by various neurological deficits observed in humans following specific lesions: impairments in the perception of space (e.g. neglect) after lesions of parietal cortex (Driver and Mattingley, 1998), and impairments in color (achromatopsia) or shape (agnosia) perception following lesions of temporal lobe areas (Humphreys & Riddoch, 1987). Upon lesioning ventral areas of the temporal lobe, Ungerleider and Mishkin (1982) observed that monkeys could find and manipulate objects but could not discriminate between them on the basis of their shape; lesions of dorsal areas of the parietal lobe yielded the opposite pattern of results: shape discrimination was preserved, but spatial processing was greatly impaired. They postulated that the "ventral" stream was primarily concerned with "what"-like information (i.e., the identity of objects in the scene), while the second, "dorsal" stream had to do with "where"-like information (i.e. the spatial location and movement of objects). Since then, a similar distinction has been demonstrated in humans using PET and fMRI techniques (Ungerleider & Haxby, 1994). Nevertheless, the separation between these two streams is not absolute. First, there are significant connections between areas of the ventral and dorsal streams (Morel and Bullier, 1990; Baizer et al, 1991). Second, there exist cortical areas occupying an intermediate position between the temporal and parietal lobes (in particular around the superior temporal sulcus) that cannot be easily classified (e.g. Karnath, 2001).

The case of patient D.F., who suffered from diffuse bilateral lesions affecting lateral extra-striate areas 18 and 19 (preventing ventral, but not dorsal, pathway activation), prompted a different, though not mutually exclusive, interpretation of this dichotomy. D.F. could not report the orientation (e.g. horizontal, vertical) of a slot cut into the front of a box; yet when asked to "post" her hand through the slot, she would move and orient it in perfect accordance with the orientation that she "could not" perceive. It seemed as if the patient's visually guided movements could make use of information that the subject was not explicitly aware of. Milner and Goodale (1995) proposed that the correct distinction was in fact between a "what" pathway for perception (ventral) and a "how" pathway for action (dorsal): the latter could access limited shape information, but not deliver it to the observer's awareness.

Building on these theories, the "what" ventral stream has been associated with the contents of consciousness. The "where" dorsal stream is thought to be involved in spatial cognition, and in particular in the guidance of eye movements and attentional mechanisms, as will be described in section 3.4. It is within the ventral stream that the hierarchical "feature extraction" organization is most apparent. Among other things, neurons in V1 extract information about the orientation of bars and edges (Hubel & Wiesel, 1968); neurons in V2 respond to illusory contours in addition to real ones (Von der Heydt et al, 1984); neurons in V4, to simple geometric patterns and shapes (Gallant et al, 1993; Ghose & T'so, 1997); in posterior infero-temporal cortex, to common object parts or features (Tanaka, 1996); and in anterior infero-temporal cortex, to more complex categories of objects such as faces or animals (Perrett et al, 1982; Logothetis & Sheinberg, 1996; Vogels, 1999). The human fusiform gyrus is considered the homologue of monkey infero-temporal cortex, with strong face-selective responses obtained both in intracranial electrophysiological recordings (Allison et al, 1999) and in fMRI (Kanwisher et al, 1997) in humans, as well as responses to other types of objects (although this point is the subject of much ongoing debate; Chao et al, 1999; Gauthier et al, 2000; Tarr & Cheng, 2003). In the human medial temporal lobe (MTL), one step higher in the cortical hierarchy, electrophysiological recordings in epileptic patients have revealed single neurons that respond selectively to individual images, celebrities and natural categories such as animals or cars (Kreiman et al, 2000). Computationally, neurons at a given stage in the hierarchy can build their selectivity by pooling together the outputs of neurons selective to simpler features at preceding levels. This powerful "feed-forward" representation scheme is implemented rather successfully in many state-of-the-art object recognition neural network models (e.g. Fukushima & Miyake, 1982; Riesenhuber & Poggio, 1999).
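To make the pooling idea concrete, here is a minimal toy sketch in Python (our own illustrative construction, not the model of any of the papers cited above, and far simpler than, e.g., the Riesenhuber & Poggio architecture): layers of template matching alternate with local max-pooling, so that units higher in the hierarchy respond to increasingly complex combinations of features while tolerating small changes in position. All sizes, templates and layer names are arbitrary assumptions.

```python
import numpy as np

def feature_layer(inputs, templates):
    """Selectivity stage: each unit responds according to how well its input
    matches a preferred template (a crude stand-in for, e.g., orientation
    tuning in V1 or shape tuning in V4)."""
    responses = inputs @ templates.T      # linear template matching
    return np.maximum(responses, 0.0)     # rectification

def pooling_layer(responses, pool_size):
    """Invariance stage: max-pool over groups of neighbouring positions, so
    the next layer inherits selectivity while tolerating small displacements."""
    n = (responses.shape[0] // pool_size) * pool_size
    return responses[:n].reshape(-1, pool_size, responses.shape[1]).max(axis=1)

# One feed-forward pass (no iteration, no feedback), consistent with the idea
# that a single wave of activation can reach highly selective units.
rng = np.random.default_rng(0)
patches = rng.normal(size=(64, 25))                          # 64 local patches of a toy image
v1_like = feature_layer(patches, rng.normal(size=(8, 25)))   # 8 "edge"-like templates
v2_like = pooling_layer(v1_like, pool_size=4)                # local position tolerance
v4_like = feature_layer(v2_like, rng.normal(size=(4, 8)))    # conjunctions of simpler features
it_like = pooling_layer(v4_like, pool_size=4).max(axis=0)    # 4 "object"-like units, position-tolerant
print("IT-like unit activations:", it_like)
```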


To conclude, the hierarchical organization of the ventral visual pathway is compatible with the fact that some forms of object recognition can be performed both effortlessly and rapidly: a single wave of activation propagating through the system can be sufficient to activate highly specific neurons with complex selectivities (VanRullen & Thorpe, 2002). Accordingly, some high-level categorization tasks (e.g. animal vs. non-animal) can be performed in as little as 150 ms (as assayed by EEG in humans; Thorpe et al, 1996; VanRullen & Thorpe, 2001), i.e. just as long as it takes for information to race through the system and initiate firing in the relevant selective neurons. Additionally, because this “feed-forward” hierarchical representation strategy relies on “hardwired” (but potentially learned) connections, it does not take much effort from the observer to activate the higher-level neurons: in fact, the first experiments to record face-selective responses in monkey IT cortex were performed on anaesthetized animals (e.g. Gross et al, 1972)!

3.1.3. Attention and the limits of feed-forward processing

What is the role of attention in this type of hierarchical architecture? Is attention necessary for activating neurons in the highest levels, for example those that are selective to complex categories of objects? This would be a simple way to account for the failures of perception observed in change blindness: since attention has limited capacity, only a few objects at a time (those selected by attention) could be fully processed. However, recent psychophysical results indicate that this is not always the case (Li et al, 2002; Rousselet et al, 2002). Natural scenes containing animals or vehicles can be categorized even when focal, top-down attention is occupied elsewhere (see Fig 3.3). In the "dual-task" experiment (Li et al, 2002), subjects were presented with displays containing a central stimulus made of 5 randomly rotated letters (L or T) and, at a random location along an imaginary rectangle (about 5 degrees of eccentricity), a natural scene that contained one or more animals or vehicles with 50% probability. In some cases, subjects were asked to ignore the central display and decide whether or not the peripheral scene contained an animal, or a vehicle. In other cases, they had to perform this categorization task while simultaneously deciding whether the 5 central letters were all identical or whether one of them was different, a task known to require focused attention. The peripheral natural scene categorization task was performed just as well alone as in the presence of the distracting central task (i.e. in the "dual-task" condition). In other words, even while attention was focused away from the peripheral scene, subjects were still able to detect the presence of an animal or a vehicle.
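The dual-task results are usually summarized by normalizing accuracy in the dual-task condition against accuracy when each task is performed alone, as in Fig 3.3. The snippet below illustrates one such normalization with invented numbers; the chance-level correction and the values themselves are assumptions for illustration only, not data from Li et al (2002).

```python
# Hypothetical accuracies (proportion correct); chance level assumed to be 0.5.
single_central    = 0.92   # central letter task performed alone
single_peripheral = 0.90   # peripheral scene categorization performed alone
dual_central      = 0.90   # central task accuracy in the dual-task condition
dual_peripheral   = 0.88   # peripheral task accuracy in the dual-task condition

def normalize(dual, single, chance=0.5):
    """Rescale accuracy so that chance level maps to 0 and single-task level to 1."""
    return (dual - chance) / (single - chance)

print("central (abscissa):   ", round(normalize(dual_central, single_central), 2))
print("peripheral (ordinate):", round(normalize(dual_peripheral, single_peripheral), 2))
# Values close to 1 on both axes, as reported for natural scene categorization,
# indicate that the peripheral task suffers little from the central attentional load.
```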

It appears that, at least under optimal conditions of stimulation in trained subjects, explicit neuronal representations can be activated without the need for focused, top-down attention. This is true not only for lower but also for the higher levels of the cortical visual hierarchy. This idea would seem to contradict the observation that attention is needed to avoid processing failures, as revealed in change or inattentional blindness experiments. But there is in fact no contradiction: in the natural environment, many complications can arise that thwart the feed-forward hierarchical strategy. That is when attention will be called into play.


Fig 3.3. Preattentive natural scene processing. A. In the "dual-task" paradigm, a peripheral stimulus must be categorized while attention is focused on a demanding central task (visual search for an odd element among 5 randomly rotated Ls or Ts). B. Natural scene categorization (e.g. animal vs. non-animal, vehicle vs. non-vehicle) can be performed under these conditions, while other, seemingly less complex tasks (randomly rotated L vs. T, bisected 2-color disk vs. its mirror image) suffer greatly from the removal of attention. Each plot represents performance in the dual-task condition (each circle being the average performance of one subject), normalized with respect to performance on the central letter task (abscissa) or the peripheral task (ordinate) when each is performed alone. Adapted from Li et al (2001).

The feed-forward representation strategy reaches its limits in the many situations where the visual scene fails, at first glance, to activate meaningful representations. This can occur for at least two reasons: (i) when the objects or categories to be recognized are not explicitly represented in the existing cortical hierarchy (novel objects, unusual feature combinations, unusual viewpoints of familiar objects, etc.), or (ii) when the organization of the visual scene (clutter, noise, low contrast, etc.) prevents the activation of the correct neuronal representation(s). In both cases, feedback mechanisms must be recruited to ensure efficient scene processing. These mechanisms constitute what is generally referred to as visual "attention".

In the first case, when the relevant selectivities are simply not expressed directly in the cortical hierarchy, certain neuronal operations must be performed to arrive at a meaningful description of the scene or objects in terms of existing knowledge. This is probably the case for the discrimination tasks presented in Fig 3.3. (bottom: randomly rotated L vs. T; bisected 2-color disk vs. its mirror image), which could not be performed when focal attention was occupied by a central task. “Binding” is an example of one such operation, which will be addressed in the next section. In the second case, when clutter, noise or other aspects of the scene prevent the activation of the correct representations, specific mechanisms guided by prior assumptions about the location or other features of a stimulus can bias processing for this location or feature, and restore the correct output. These mechanisms will be detailed in section 3.3, while the origin of the bias and the structures that are capable of dynamically generating such assumptions will be reviewed in section 3.4.

3.2. ATTENTION AND FEATURE INTEGRATION

3.2.1. Texture segregation.

Bela Julesz was among the first scientists to use the "visual search" paradigm to study the organization of visual processing (Julesz, 1975, 1981, 1986). When one target element, or a group of target elements, is embedded in a field of distractors, the portion of the scene made up of the target element(s) will sometimes segregate at first glance, i.e. "pop out". With other combinations of targets and distractors, "pop-out" will not be observed. Julesz hypothesized that those elements that could easily segregate could be considered "elementary" features for visual processing, or "textons".


Fig 3.4. Visual search and Feature-Integration Theory. A. Examples of parallel search. The target (odd element) "pops out" of the display. In practice, the time taken to detect the presence of the target is independent of the number of distractors in the search array. B. Examples of serial search. Here, detecting the target requires exploration of the array. The search time increases linearly with the number of distractors. This type of result is obtained for targets defined by a conjunction of simple features (left, target is the red bar tilted to the right). The idea behind the Feature-Integration theory is that such conjunctions can only be recognized after attention has been focused on them, while the detection of the features themselves is preattentive and parallel. Serial search is also obtained for targets that do not easily segregate against the texture defined by the background elements (middle). Note that targets defined by 3-D shape-from-shading information (A, right) are detected more easily than very similar, but 2-D, elements (B, right). C. Example of search asymmetry. While detecting the letter Q among Os is effortless, finding the letter O among Qs is not. D. Examples of search slopes (RT vs. set size) obtained for 8 subjects on search tasks similar to the middle panels of A and B. While an increase in set size only mildly increases the search time for a rotated L among +s (search slope 5 ms/item, i.e. "parallel"), it dramatically affects the search for a rotated L among Ts (40 ms/item, i.e. "serial"). Data modified from VanRullen et al (2003).


3.2.2. Visual search, feature integration and the binding problem.

A slightly different version of the visual search paradigm has the subject search for a single target embedded (on a certain proportion of trials) among a field of distractors (e.g. Bergen & Julesz, 1983). Accuracy or, more often, the time taken to find the target is the critical variable. If reaction time is independent of the number of distractors, search is said to be "parallel"; subjectively, the target "pops out" from the distractors (Fig 3.4.A). If the search time increases linearly with the number of elements in the array, search is said to be "serial" (Fig 3.4.B). Anne Treisman is primarily responsible for the popularity of this form of visual search (Treisman & Gelade, 1980). She found that certain object features such as color or orientation could be detected in parallel, while conjunctions of these features could not (compare Fig 3.4.A, left with 3.4.B, left). The term "conjunction search" is now often used as a synonym for "serial search" (incorrectly, as will be explained later). Treisman hypothesized that simple features are represented in parallel across the visual field, but that their conjunctions can only be recognized after attention has been focused on the corresponding location. Such shifts of attention are not instantaneous but time-consuming, or "serial". Thus, in the case of feature conjunctions, search times are expected to increase with the number of elements in the array. This hypothesis is referred to as the "Feature Integration" theory of attention (Treisman & Gelade, 1980). Computationally, this scheme avoids a combinatorial explosion of possible objects to be represented: only the elementary features need to be explicitly represented within "parallel feature maps" (similar to the orientation maps observed in primary visual cortex), while their potential conjunctions are dynamically "bound" according to the current task demands. At the same time, this representation strategy allows the system to disambiguate situations in which many objects are present simultaneously, each with its own features to be bound together. For example, a scene containing both a horizontal and a vertical line, one of them green and the other red, can be interpreted in two different ways: a red horizontal line and a green vertical one, or vice versa. Determining which of the alternatives is correct is not a trivial problem. Indeed, it was found that with very brief processing times, features from different objects could sometimes be mistakenly bound, creating "illusory conjunctions" (Treisman & Schmidt, 1982). This is a manifestation of the well-known "binding problem" (von der Malsburg, 1981; Treisman, 1996, 1999), and the Feature Integration theory was proposed as one attractive solution to this problem (Crick and Koch, 1990): by focusing on one object at the expense of all others, attention disambiguates the interpretation of the visual scene; the perceived color, size, shape, etc. at any one time are "necessarily" associated with the object currently attended.
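In practice, the parallel/serial distinction is quantified by the slope of the function relating reaction time to set size. The following sketch shows how such a search slope can be estimated by linear regression; the reaction times are fabricated for illustration and are not taken from any of the experiments cited here.

```python
import numpy as np

# Set sizes used in a hypothetical search experiment, and mean correct RTs (ms)
# on target-present trials. These values are invented for illustration.
set_sizes = np.array([4, 8, 12, 16])
rt_feature     = np.array([520, 523, 527, 531])   # e.g. a salient feature target: "pop-out"
rt_conjunction = np.array([560, 720, 885, 1050])  # e.g. a rotated L among Ts

def search_slope(set_sizes, rts):
    """Least-squares slope of RT against set size, in ms per item."""
    slope, intercept = np.polyfit(set_sizes, rts, deg=1)
    return slope, intercept

for label, rts in [("feature", rt_feature), ("conjunction", rt_conjunction)]:
    slope, intercept = search_slope(set_sizes, rts)
    print(f"{label:12s}: {slope:5.1f} ms/item (intercept {intercept:.0f} ms)")
# The first slope (about 1 ms/item) would traditionally be labelled "parallel", the
# second (about 40 ms/item) "serial", keeping in mind that real search slopes form
# a continuum rather than two distinct classes.
```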

Note finally that “parallel” search in Treisman’s Feature Integration theory is closely related to “effortless texture segregation” in the texton theory of Julesz (1981, 1986), but not equivalent (Wolfe, 1992).

3.2.3. Modifications to the Feature-Integration Theory

Over the last 20 years, Treisman's original view has been reformulated (Treisman & Gormican, 1988; Treisman & Sato, 1990; Wolfe et al, 1989) to accommodate new experimental findings. For example, the observation that certain targets (e.g. the letter Q) pop out among distractors (letters O), while the reverse arrangement does not pop out (an O among an array of Qs; Fig 3.4.C), termed a "search asymmetry", called for the inclusion of new "feature types" within the theory (Treisman & Gormican, 1988). More problematic was the finding that certain conjunction searches could in fact be performed in parallel (e.g. McLeod et al, 1988; Theeuwes & Kooi, 1994). To account for this, it was proposed that search (and thus attention) could be guided not only by spatial location but also by prior knowledge about the relevant features, either by inhibiting distractor features (Treisman & Sato, 1990), exciting target features (Wolfe et al, 1989), or both (Driver et al, 1992). More recently, the distinction between parallel and serial search has itself been challenged (Wolfe, 1998; Nakayama & Joseph, 1998; Palmer et al, 2000). One of the principal reasons for this was the observation that serial search slopes (i.e. the average increase in search reaction time as the number of distractors increases) in different experiments could take very different values (ranging from about 20 to more than 200 ms/item), and that the exact boundary between parallel and serial search slopes was ill-defined (Wolfe, 1998). It is now accepted that visual search performance follows a continuum rather than a true dichotomy. In spite of these limitations, Feature Integration theory has withstood the test of time in a remarkable manner, and the visual search paradigm continues to shape most of today's research on visual attention.

3.2.4. Interpreting visual search

One problem with the interpretation of visual search results is that in such crowded, multi-element displays, the effects of attention can be confounded with (i) inter-stimulus competition (i.e. clutter) within one receptive field or (ii) lateral interactions outside the classical receptive field. It is known that clutter decreases an element's visibility (lateral masking or "crowding"; He et al, 1996; Motter & Holsapple, 2000), while lateral interactions such as surface integration (Nakayama & Shimojo, 1992; He & Nakayama, 1992), texture segregation (Julesz, 1981) or distractor grouping can enhance the target's detectability (Duncan & Humphreys, 1989). In other words, the simultaneous presentation of multiple stimuli can affect the processing demands of a task in unexpected ways; thus, visual search results might not apply to isolated objects. To avoid these confounds, it might be more appropriate to investigate attentional effects on isolated objects directly. For example, one might compare recognition performance for isolated objects when attention is or is not available, the so-called "dual-task" paradigm (Sperling and Melchner, 1978; Sperling and Dosher, 1986; Braun and Sagi, 1990; Braun and Julesz, 1998; see also Fig 3.3). Until recently, it was assumed that objects that could be recognized under such dual-task conditions (i.e. "preattentive" elements) corresponded to the "parallel" features observed in visual search (Braun, 1998). For example, oriented bars such as those shown in Fig 3.4.A, left, could be discriminated without attention, while bisected 2-color disks such as those shown in Fig 3.4.B, right, could not (Braun and Julesz, 1998). Yet recent results suggest that visual tasks that are "preattentive" under this dual-task paradigm will not necessarily yield parallel performance during visual search, and vice versa (Fig 3.5). For example, although subjects can recognize whether or not a natural scene contains one or more animals even when top-down attention is occupied elsewhere (Fig 3.3; Li et al, 2002), such an "animal" scene cannot easily be spotted in a search array containing many other, distracting scenes (VanRullen, Reddy & Koch, 2003). Visual search and dual-task paradigms are therefore not interchangeable, and may probe slightly different aspects of visual attention. It could be that the "preattentive" features that researchers have been exploring with visual search for the last 20 years would be better revealed using the dual-task paradigm. As yet, the precise role of attention in these two paradigms is not known, but it appears to be a promising topic for future research.

Fig 3.5. Dual-task and visual search reveal distinct aspects of visual attention. Animal detection in natural scenes, although "preattentive" in the dual-task paradigm (see panels C-D, and Fig 3.3), cannot be performed in parallel across the visual field. Visual search performance was found to be serial, both in response-terminated search (A; dependent variable: search slope) and in masked search (B; dependent variable: percentage correct). For comparison, two classical examples of parallel (rotated L vs. +) and serial (rotated L vs. T) search were performed by the same subjects during the same experimental sessions. On the other hand, some discrimination tasks that are known to be parallel in visual search could not be performed without attention in the dual task: C. rotated L vs. + (compare with the black curves in A and B); D. depth-rotated cubes. For comparison, the same subjects also performed the animal categorization task (preattentive) during the same experimental sessions. Adapted from VanRullen et al (2003).

3.2.5. Summary

To summarize, this line of research suggests that one of the roles of attention is to link or "bind" preattentive "features" to dynamically construct more complex representations, in particular in situations where the object to be recognized does not correspond directly to any existing neuronal selectivity. In more general terms, attention might be necessary to perform certain mental operations (including "binding") that would allow the system to match the input to one or more stored representations.

3.3. ATTENTION AND BIASED COMPETITION

Even when the object to be processed by the system matches an existing neuronal representation (say, a picture of a face), the firing of the relevant neurons can often be prevented by certain aspects of the scene such as clutter, low contrast or external noise. It is one of the roles of attention to selectively enhance the target signal in order to restore the correct neuronal response. One particularly interesting aspect of this form of attention is that it has been studied not only with psychophysics, but also with single-cell electrophysiology and functional imaging. We thus have a relatively clear idea of the effects of attention at the single-cell level.

3.3.1. Attention at the single-cell level

To illustrate these results, let us take the example of, say, an orientation-selective neuron in area V4. The locus of top-down attention is manipulated by having the monkey expect a task-relevant stimulus (often predicting a reward) at a particular location. A task-irrelevant stimulus can then be flashed unexpectedly, either near or far from the focus of attention. First, the effects of attention can be investigated using an isolated stimulus. In that case, the net attentional effect resembles a multiplicative scaling (McAdams & Maunsell, 1999): the strength of responses near the focus of attention is directly proportional (with a factor greater than 1) to the responses obtained for the same stimuli away from attention. The magnitude of these effects depends on the distance between the stimulus and the focus of attention (Connor et al, 1996). The most surprising effects of attention, however, are revealed when more than one stimulus is presented at once. As represented in Fig 3.6, presenting two stimuli (one "preferred" and one "non-preferred") simultaneously inside one neuron's receptive field yields a response that is generally a weighted average of (i.e. intermediate between) the responses elicited by each stimulus independently. This can be interpreted as a simple form of "clutter", or competition between the neuron's inputs. The effect of attention is to enhance the competitive value of one or the other stimulus: if attention is drawn to the "preferred" stimulus, the neuron's response is increased; if drawn to the "non-preferred" stimulus, it is decreased (Moran & Desimone, 1985; Reynolds et al, 1999). In fact, this effect is comparable to what would be observed if one of the stimuli had its contrast increased relative to the other (Reynolds et al, 2000). Thus, the effects of attention at the single-cell level can be described as an increase in contrast sensitivity. This property also accounts, in a rather elegant manner, for the multiplicative scaling observed when attention is drawn to one isolated stimulus (Reynolds and Desimone, 1999).
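These single-cell findings are often summarized by a weighted-average model in the spirit of Reynolds et al (1999), in which attending to a stimulus effectively increases the weight (or contrast) of its input to the neuron. The sketch below is a simplified illustration of that class of model, with invented firing rates and an assumed attentional gain factor, not a fit to any recorded data.

```python
def pair_response(r_good, r_poor, w_good=1.0, w_poor=1.0, attend=None, gain=3.0):
    """Weighted average of the responses to two stimuli in the receptive field.
    Attending to a stimulus multiplies its input weight by `gain`, which biases
    the competition (roughly equivalent to raising that stimulus's contrast)."""
    if attend == "good":
        w_good *= gain
    elif attend == "poor":
        w_poor *= gain
    return (w_good * r_good + w_poor * r_poor) / (w_good + w_poor)

r_good, r_poor = 60.0, 10.0   # firing rates (spikes/s) to each stimulus alone (invented)

print("pair, attention elsewhere:", pair_response(r_good, r_poor))                   # 35, intermediate
print("pair, attend 'good'      :", pair_response(r_good, r_poor, attend="good"))    # pulled toward 60
print("pair, attend 'poor'      :", pair_response(r_good, r_poor, attend="poor"))    # pulled toward 10
```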

These attentional effects are not restricted to area V4; similar observations have been made in IT (e.g. Moran & Desimone, 1985), in cortical areas MT and MST (e.g. Treue & Maunsell, 1996), and in V2 (e.g. Reynolds et al, 1999). While the small receptive field size of V1 neurons often precludes the simultaneous presentation of two stimuli, attentional effects can be revealed there using alternative paradigms (Motter, 1993; Vidyasagar, 1998; Roelfsema et al, 1998). fMRI in humans has confirmed this body of results (Tootell et al, 1998; Kastner et al, 1998), and in particular the influence of attention on V1 activity (Watanabe et al, 1998; Somers et al, 1999; Brefczynski & DeYoe, 1999; Ress et al, 2000). However, the cellular basis of the fMRI BOLD signal is only starting to be understood (Logothetis et al, 2001), and the exact relationship between attentional effects measured in single-cell recordings (usually in monkeys) and in functional imaging (usually in humans) remains unclear (Heeger & Ress, 2002).

Fig 3.6. Attention at the neuronal level. The effects of attention are most apparent when two stimuli are presented simultaneously and compete for the neuron's responses. This hypothetical V4 neuron shows strong responses to a vertical bar presented in isolation ("good" stimulus) and weak responses to a horizontal bar ("poor" stimulus). When both stimuli are presented at the same time ("pair"), and attention is maintained away from the receptive field, the resulting neuronal response is intermediate between the two previous responses (dashed line). Attention draws the neuron's response towards the response that would be elicited by the attended stimulus in isolation (gray lines). When attention is focused on the "good" stimulus, the response to the pair is increased. When attention is focused on the "poor" stimulus, the response to the pair is decreased. Attention thus confers a competitive advantage to one of the stimuli. Adapted from Reynolds & Desimone (1999).


3.3.2. The "Biased Competition" framework

At the system level, the effects of attention have been alternatively described as a "winner-take-all" mechanism (Lee et al, 1999), an increase of effective contrast (Reynolds & Desimone, 1999), a gain control device (McAdams & Maunsell, 1999; Treue & Martinez Trujillo, 1999; Salinas & Abbott, 1997), or a signal enhancement/noise reduction process (Carrasco et al, 2000; Dosher & Lu, 2000). The common denominator is the role attributed to attention: to restore the output (i.e. the neuronal response) that would have been obtained if clutter, low contrast or external noise had not disrupted it. These aspects of the organization of visual processing were outlined in the influential "biased competition" framework of Desimone and Duncan (1995). Their principal idea is that all visual stimuli (including noise, etc.) constantly compete for neuronal resources. For example, a "good" and a "poor" stimulus simultaneously presented in the same neuron's receptive field compete to dominate the neuron's output. Without attention, the net result of this competition is an intermediate response between those elicited by each stimulus alone (see Fig 3.6). The information conveyed by this neuron will not faithfully reflect either of the stimuli presented, and subsequent failures of recognition become more likely. External factors, such as an elevation of contrast, can give a competitive advantage to one of the stimuli. Desimone and Duncan's framework states that the role of attention is to bias this competition in favor of the attended stimulus. Essentially, the attentional bias is equivalent to a contrast elevation of the attended element or feature. Because of the bias, the confusion is resolved, and the system can behave as if only the attended stimulus were present.

One question arises naturally at this point: where does the attentional bias come from? In fact, both top-down (voluntary, driven by the observer's internal motivations) and bottom-up (involuntary, driven by the stimulus itself) biases can coexist. These are often referred to simply as top-down vs. bottom-up attention. Although their effects on neuronal and behavioral performance might be similar, they have very distinct dynamics, the latter being more transient and the former more sustained (Weichselgartner & Sperling, 1987; Nakayama & Mackeben, 1989). The next section covers the potential origins of attentional signals, with an emphasis on bottom-up, saliency-driven attention.

3.4. ATTENTION ORIENTING SYSTEMS

3.4.1. Visual neglect and extinction.

The condition of "hemi-spatial" or "unilateral" neglect informs us about the structures involved in the orienting of attention (Driver & Mattingley, 1998). This clinical syndrome affects patients who have suffered lesions of posterior parietal areas, generally in the right hemisphere (owing to a hemispheric asymmetry in spatial cognition that is not yet fully understood; Husain & Rorden, 2003). Typically, such patients have deficits in perceiving objects on the contralesional (i.e., left) side of the world. This impairment is not expressed in purely retinal coordinates (i.e. it is not a "scotoma") but depends also on the direction of the head and body (e.g. Kooistra & Heilman, 1989), and on that of attention. For example, a neglect patient asked to copy drawings of objects will typically omit the left side of each object (Driver et al, 1992; Driver, 1995), irrespective of its position on the paper. In a variant of this syndrome (or perhaps a milder form of impairment) called "visual extinction", a single isolated stimulus is generally perceived normally when placed in the contralesional field by itself, yet the same stimulus cannot be perceived if a second object is presented simultaneously in the other hemifield (the ipsilesional object thus "extinguishes" the percept of the contralesional object). This aspect is reminiscent of the neuronal competition between stimuli emphasized by Desimone and Duncan (1995), and of the need for attention to "bias" this competition in favor of one or the other object. Thus, extinction and neglect are generally interpreted as an impairment in shifting attention toward the contralesional side (Posner et al, 1984; Baylis et al, 1993).

The exact anatomical site of visual neglect has recently become the subject of controversy, with some evidence that the site of lesions in humans might be less dorsal than previously thought, i.e. near the temporo-parietal junction or even in superior temporal areas (Karnath et al, 2001). Corbetta and Shulman (2002) propose to distinguish between a temporo-parietal complex involved in the capture of bottom-up, saliency-based attention, and a more dorsal, intra-parietal complex responsible for top-down, task-dependent control of attention. Lesions to either of these complexes (or both) could produce neglect symptoms. These distinctions aside, one can reasonably conclude that posterior parietal cortex, as a whole, is one of the main actors in the orienting of attention.

3.4.2. The saliency map hypothesis.

What are the exact computational steps leading to the decision to direct attention to one object in the scene, at the expense of others? The "saliency map" (Koch & Ullman, 1985) was one of the first computational hypotheses to address this question. The concept refers to a 2-dimensional retinotopic map of the visual scene in which the quantity represented (e.g. by the firing of neurons) is the overall saliency at each location, independent of which feature dimension (contrast, color, movement, etc.) accounts for this saliency (Itti & Koch, 2001a). The system, illustrated in Fig 3.7, extracts stimulus intensity values in parallel along a number of feature dimensions (e.g. luminance, color, orientation, motion, stereo disparity) and at many different spatial scales, for each location in the visual field. This is modeled by multiscale feature pyramids. The "contextual" aspects of saliency within each feature dimension are implemented by center-surround difference mechanisms and long-range lateral interactions. Activities in the resulting feature maps are linearly combined to feed the saliency map. The relative weights attributed to each feature dimension in this summation determine the extent to which that dimension participates in the computation of saliency (Itti & Koch, 2001b). These weights can therefore be dynamically adjusted by prior experience (i.e. by learning the statistics of the visual world) or by top-down attentional biases.

Within the saliency map, a "winner-take-all" mechanism sequentially determines which location to attend: this is the actual purpose of the saliency map. Information from the winning location is transmitted to the areas responsible for producing eye movements or attentional shifts. An additional process called "inhibition of return" ensures that this winner-take-all competition does not continuously yield the same winning location: it temporarily suppresses the saliency associated with the currently winning location. Such a mechanism has found reliable experimental support in human psychophysics (reviewed in Klein, 2000).
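A drastically simplified version of this pipeline can be sketched as follows. This is a caricature for illustration, not the Itti & Koch implementation: it uses a single intensity channel, replaces the multi-scale pyramids with differences of Gaussian-blurred images as a stand-in for center-surround competition, and implements winner-take-all plus inhibition of return as an explicit loop. All parameters (blur widths, inhibition radius, number of shifts) are arbitrary assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround(channel, sigma_center=2, sigma_surround=8):
    """Contextual saliency within one feature channel: locations that differ
    from their surround (here, in intensity) receive high values."""
    center = gaussian_filter(channel, sigma_center)
    surround = gaussian_filter(channel, sigma_surround)
    return np.abs(center - surround)

def saliency_map(channels):
    """Linear combination of the per-channel conspicuity maps (equal weights here;
    the weights are where prior experience or top-down biases could enter)."""
    maps = [center_surround(c) for c in channels]
    maps = [m / (m.max() + 1e-9) for m in maps]      # crude normalization per channel
    return np.mean(maps, axis=0)

def attend(saliency, n_shifts=3, inhibition_radius=10):
    """Winner-take-all with inhibition of return: repeatedly pick the most
    salient location, then suppress a neighbourhood around it."""
    s = saliency.copy()
    ys, xs = np.mgrid[0:s.shape[0], 0:s.shape[1]]
    visited = []
    for _ in range(n_shifts):
        y, x = np.unravel_index(np.argmax(s), s.shape)                      # winner-take-all
        visited.append((int(y), int(x)))
        s[(ys - y) ** 2 + (xs - x) ** 2 <= inhibition_radius ** 2] = 0.0    # inhibition of return
    return visited

# Toy input: a dim background with two bright patches of different contrast.
image = np.zeros((64, 64))
image[10:14, 10:14] = 1.0      # high-contrast patch: attended first
image[40:44, 45:49] = 0.5      # lower-contrast patch: attended next
print(attend(saliency_map([image])))
```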

Fig 3.7. Directing bottom-up attention: the saliency map hypothesis. An input scene is decomposed in parallel into a number of basic feature dimensions (e.g. intensity, orientation, color, motion, temporal change), encoded in multi-scale pyramids. Spatial competition takes place within these channels through lateral interactions and center-surround competition; such mechanisms are meant to implement the effects of spatial context on saliency. The resulting feature maps are combined through a linear summation whose weights are fixed by prior experience, and can be modulated by current behavioral purposes (e.g. top-down attention). The saliency map is the last stage of this system. It represents saliency in a retinotopic fashion. A winner-take-all mechanism picks the most salient location in this map and transmits its result to the networks responsible for shifting attention. In addition, a process called "inhibition of return" ensures that the winning location is temporarily suppressed in the saliency map, so as to allow attention to explore other locations. Adapted from Itti and Koch (2001a).

As for the saliency map itself, recent electrophysiological evidence points to not just one but multiple candidate areas: the pulvinar (Laberge & Buchsbaum, 1990; Robinson & Petersen, 1992), the superior colliculus (Kustov & Robinson, 1996), the frontal eye field (Thompson & Schall, 2000) or the posterior parietal cortex (Gottlieb et al, 1998; Colby & Goldberg, 1999; Bisley & Goldberg, 2003). Neurons in posterior parietal areas are known to represent space in multiple coordinate systems, including body-centered and world-centered reference frames (Andersen et al, 1997; Snyder et al, 1998). Their firing can be related to motor planning or "intention" (Snyder et al, 1997), and is modulated by the position of the attentional focus (Steinmetz & Constantinidis, 1995; Bisley & Goldberg, 2003). The involvement of posterior parietal cortex in visual neglect, together with these observations, suggests that these parietal areas participate in the guidance of attention. Feedback from parietal cortex into the ventral object recognition stream might constitute the "bias" posited by the biased competition framework.

Beyond this particular model relying on a saliency map, there are other ways of implementing the biased competition framework within a model visual system. For example, Corchs and Deco have proposed a model in which competition is implemented through dynamic interactions between ventral and dorsal modules (Corchs & Deco, 2002). Interestingly, this type of model can account for the symptoms observed in neglect: selective "lesioning" of the dorsal module results in a spatially specific impairment of the recognition performed in the ventral module (Rolls & Deco, 2002). In a related proposal by Hamker (2000), attention emerges naturally (i.e., without an explicit saliency map) from recurrent interactions between an object processing stream and a fronto-parietal network implicated in the planning of eye movements. This model thus brings together the biased competition framework and the premotor theory of attention, whereby covert attention in fact reflects the preparation of overt eye movements (Rizzolatti et al, 1987).

3.5. SUMMARY: THE ROLES OF ATTENTION

To recapitulate the substance of this chapter so far, object processing in the ventral hierarchy can proceed without attentional requirements, at least in a world consisting of familiar, isolated, high-contrast stimuli. In the real world, where visual stimuli can be unfamiliar or plagued by external noise, low contrast or clutter, competition between candidate neuronal representations often results in processing failures or confusion. Attention, in a very general sense, can be seen as a set of processing mechanisms and strategies designed to overcome these limitations.


(i) When the stimulus to be recognized is unfamiliar, is presented under an unusual viewpoint, or consists of an unusual combination of familiar elements, attention can act by initiating a set of neuronal operations (e.g. binding) aimed at matching this input to one or more stored representations.

(ii) When competition among inputs (potentially including external noise) prevents the correct functioning of the visual system, attention can act by selectively biasing the competition towards one of the competing representations, i.e. one object, its location or another target feature. The bias can arise from interactions with posterior parietal regions, encoding space and attentional priority on the basis of saliency (bottom-up) or task requirements (top-down).

3.6. VISUAL AWARENESS

How do we go from a visual stimulus to its conscious representation? And, in particular, what is the link between attention and awareness? For many psychologists, axiomatically, the contents of awareness correspond exactly to what is being attended (e.g. Posner, 1994). However, the real story is likely to be more complicated. Studies of change blindness (e.g. Rensink et al, 1997) or inattentional blindness (Mack and Rock, 1998) suggest that attention is necessary for awareness: unattended aspects of the scene simply go unnoticed, hence the term "blindness". But others argue that awareness can exist outside the focus of attention, in particular Braun (1998; see also Braun & Sagi, 1990), basing this view on the results of dual-task experiments. Indeed, the remarkable residual perceptual abilities that subjects appear to enjoy in the periphery, while their attention is tied down at the center of the visual field, do not support the idea of a pure and simple "blindness" when attention is unavailable. For Rensink (2000), high-level representations of objects, or "proto-objects", can be formed preattentively. In a recent proposal, Lamme (2003) even suggests that visual awareness could precede the stage of attentional selection. Finally, simple introspection tells us that we are certainly not blind around the object of our attention. Thus, equating attention with awareness would undoubtedly be a mistake, although the exact relationship between attentional and conscious selection remains to be defined.

In terms of the cortical hierarchical organization emphasized in previous sections, the activation of "essential nodes" (Zeki, 2001) might be a necessary condition for awareness (Crick and Koch, 2003): for example, activation of certain infero-temporal neurons for faces (or, in humans, of the Fusiform Face Area; Kanwisher et al, 1997), or of certain MT neurons for movement. This activity in itself might not be sufficient for awareness (i.e. such nodes can show activation without an associated percept; Rees et al, 2000; Bar et al, 2001; Moutoussis & Zeki, 2002): it might also need to be conveyed forward to higher-level planning centers (Crick & Koch, 1995), or backwards to lower-level visual areas (e.g. Bullier, 2001). Note that the posterior parietal cortex might not be strictly necessary for consciousness: patients suffering from Balint's syndrome or "simultanagnosia" (resulting from a bilateral lesion of parietal cortices) can only perceive one object at a time, yet they are certainly conscious of it (Robertson et al, 1997). In fact, the locus of visual awareness is still an open question and the subject of intense debate. There are at least three possibilities.


• High-level theory of awareness.

Because of the complex nature of awareness (and the difficulty of apprehending it from a scientific point of view), and its intimate relation to our own thoughts, awareness naturally appears as the "ultimate" level of representation, the endpoint of the cortical hierarchy. This is a common view within scientific circles. For instance, Crick and Koch (1995) proposed that the purpose of awareness is to "produce the best current interpretation of the visual scene [...] and to make it available, for a sufficient time, to the parts of the brain that contemplate, plan and execute voluntary motor outputs". In this context, awareness could only arise from visual cortical areas having direct connections to the planning centers of prefrontal cortex. Based on the lack of direct connections between V1 and the prefrontal areas anterior to the central sulcus in the macaque monkey, Crick and Koch (1995) argued that V1, although essential for normal seeing, does not contain the neuronal representations that directly give rise to conscious visual perception. This is compatible with most of the single-cell electrophysiological data from the macaque monkey (Logothetis, 1998).

• Low-level theory of awareness.

A recent trend among some visual neuroscientists is to take the opposite viewpoint: visual awareness would arise in lower-level cortical areas. Admittedly, the selectivities of V1 neurons as described in the pioneering experiments of Hubel and Wiesel (1968) appear too "simple" to support conscious representations, but more recent experiments have revealed that the late part of V1 neurons' responses can reflect higher-level processes such as figure-ground segmentation (Zipser et al, 1996; Hupe et al, 1998) or attention (Roelfsema et al, 1998), and can even be correlated with subjective perception (Super et al, 2001; see, however, Cumming & Parker, 1997 for one of several contradictory results). These late but complex responses are likely to reflect feedback from extrastriate areas (Lamme & Roelfsema, 2000; Bullier et al, 2001). It has thus been suggested that primary visual cortex serves as an active "blackboard" for visual perception (Bullier, 2001), onto which the results of specific computations in higher-level areas are "projected back", and hence made available to the rest of the visual cortical hierarchy. It is this back-projection of information onto V1 that would constitute the neural correlate of the aware percept (Lamme et al, 2000). Further experimental support has been provided by a transcranial magnetic stimulation (TMS) experiment showing that visual awareness of movement could be prevented by selectively (in the time domain, rather than in the space domain) disrupting feedback from area MT (the "essential node" for movement) to V1 (Pascual-Leone & Walsh, 2001). Yet this hypothesis also suffers from major shortcomings: for example, V1 activity is greatly reduced during dreams, while awareness itself is still present (Hobson et al, 1998); conversely, during neglect or extinction, V1 activity can be preserved while awareness is disrupted (Rees et al, 2000; Vuilleumier et al, 2001). It is therefore difficult to argue that V1 is the seat of consciousness.

• Intermediate-level theory of awareness.


Yet another idea was put forth by Jackendoff (1987), regarding the organization of language: he postulated that we are not directly aware of our thoughts, but only of a sensory representation of them (e.g. in silent speech). In the visual domain, the equivalent theory would be one where higher-level concepts or categories such as those represented in parts of prefrontal cortex would not be accessible to awareness (Crick and Koch, 2003). This is consistent with the fact that properties of higher-level neurons such as “viewpoint” or “position invariance” have no direct correspondence in our subjective perceptions. For instance, it is well known that certain IT neurons that respond selectively to complex objects (e.g. faces) can do so under various viewpoints, and, to some extent, independently of the object’s position (Tanaka, 1996). Yet objects are subjectively experienced with a definite position, under a given viewpoint; we cannot see a face as a profile and a front view at the same time. The selectivities of intermediate-level neurons (e.g. V4, posterior infero-temporal cortex), which are more sensitive to such viewpoint or location changes, might thus describe our conscious visual experience more accurately. This position would predict that as one moves up the visual hierarchy and into medial temporal and prefrontal regions, the proportion of neurons that directly express the content of conscious visual perception might peak and then decrease.

Note that, depending on exactly what areas are considered as “intermediate”, this theory can be more or less compatible with higher-level theories (e.g. if the “intermediate” areas have direct projections to frontal cortex) as well as lower-level theories (e.g. if V2 is considered as an “intermediate” area).

These theories of visual awareness are, at this stage, necessarily bold, since the available evidence is sparse and often contradictory. They will certainly need significant revisions, and the truth may well lie in a mixture of these ideas, or in a totally different, unexplored direction. However, this field of investigation is rather new (at least, it is only recently that scientists have openly started to take an interest in the study of awareness), and the next decades will certainly teach us a great deal about the workings of consciousness.

3.7. SUMMARY.

We review current theories of selective visual attention and awareness, illustrated by relevant psychophysical, clinical, electrophysiological and functional imaging results. The data are interpreted in light of the known neuronal structures and mechanisms, drawing in particular on the macaque monkey whenever human experiments are too sparse or nonexistent. We first define the functions of selective attention, then review its effects at the behavioral and neuronal levels. The structures responsible for orienting visual attention in space are described, with an emphasis on the concept of the "saliency map". Finally, we give an overview of current ideas on the problem of the neural correlates of visual awareness.


3.8. REFERENCES

Allison, T., Puce, A., Spencer, D. D., & McCarthy, G. (1999). Electrophysiological studies of human face perception. I: Potentials generated in occipitotemporal cortex by face and non-face stimuli. Cereb Cortex, 9(5), 415-430.
Andersen, R. A., Snyder, L. H., Bradley, D. C., & Xing, J. (1997). Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annu Rev Neurosci, 20, 303-330.
Baizer, J. S., Ungerleider, L. G., & Desimone, R. (1991). Organization of visual inputs to the inferior temporal and posterior parietal cortex in macaques. J Neurosci, 11(1), 168-190.
Bar, M., Tootell, R. B., Schacter, D. L., Greve, D. N., Fischl, B., Mendola, J. D., Rosen, B. R., & Dale, A. M. (2001). Cortical mechanisms specific to explicit visual object recognition. Neuron, 29(2), 529-535.
Barlow, H. B. (1972). Single units and sensation: a neuron doctrine for perceptual psychology? Perception, 1(4), 371-394.
Baylis, G. C., Driver, J., & Rafal, R. D. (1993). Visual extinction and stimulus repetition. J Cog Neuroscience, 5, 453-466.
Bergen, J. R., & Julesz, B. (1983). Parallel versus serial processing in rapid pattern discrimination. Nature, 303(5919), 696-698.
Bisley, J. W., & Goldberg, M. E. (2003). Neuronal activity in the lateral intraparietal area and spatial attention. Science, 299(5603), 81-86.
Blake, R., & Logothetis, N. K. (2002). Visual competition. Nat Rev Neurosci, 3(1), 13-21.
Bonneh, Y. S., Cooperman, A., & Sagi, D. (2001). Motion-induced blindness in normal observers. Nature, 411(6839), 798-801.
Braun, J., & Sagi, D. (1990). Vision outside the focus of attention. Percept Psychophys, 48(1), 45-58.
Braun, J., & Julesz, B. (1998). Withdrawing attention at little or no cost: detection and discrimination tasks. Percept Psychophys, 60(1), 1-23.
Braun, J. (1998). Divided attention: narrowing the gap between brain and behavior. In R. Parasuraman (Ed.), The attentive brain (pp. 327-351). Cambridge, MA: MIT Press.
Brefczynski, J. A., & DeYoe, E. A. (1999). A physiological correlate of the 'spotlight' of visual attention. Nat Neurosci, 2(4), 370-374.
Bullier, J., Kennedy, H., & Salinger, W. (1984). Branching and laminar origin of projections between visual cortical areas in the cat. J Comp Neurol, 228(3), 329-341.
Bullier, J. (2001). Feedback connections and conscious vision. Trends Cogn Sci, 5(9), 369-370.
Bullier, J., Hupe, J. M., James, A. C., & Girard, P. (2001). The role of feedback connections in shaping the responses of visual cortical neurons. Prog Brain Res, 134, 193-204.
Carrasco, M., Penpeci-Talgar, C., & Eckstein, M. (2000). Spatial covert attention increases contrast sensitivity across the CSF: support for signal enhancement. Vision Res, 40(10-12), 1203-1215.
Chao, L. L., Martin, A., & Haxby, J. V. (1999). Are face-responsive regions selective only for faces? Neuroreport, 10(14), 2945-2950.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annu Rev Neurosci, 22, 319-349.
Connor, C. E., Gallant, J. L., Preddie, D. C., & Van Essen, D. C. (1996). Responses in area V4 depend on the spatial relationship between stimulus and attention. J Neurophysiol, 75(3), 1306-1308.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci, 3(3), 201-215.
Corchs, S., & Deco, G. (2002). Large-scale neural model for visual attention: integration of experimental single-cell and fMRI data. Cereb Cortex, 12(4), 339-348.
Crick, F., & Koch, C. (1990). Towards a neurobiological theory of consciousness. Seminars in the Neurosciences, 2, 263-275.
Crick, F., & Koch, C. (1995). Are we aware of neural activity in primary visual cortex? Nature, 375(6527), 121-123.
Crick, F., & Koch, C. (2003). A framework for consciousness. Nat Neurosci, 6(2), 119-126.
Cumming, B. G., & Parker, A. J. (1997). Responses of primary visual cortical neurons to binocular disparity without depth perception. Nature, 389(6648), 280-283.
Deco, G., & Rolls, E. T. (2002). Object-based visual neglect: a computational hypothesis. Eur J Neurosci, 16(10), 1994-2000.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193-222.
Dosher, B. A., & Lu, Z. L. (2000). Noise exclusion in spatial attention. Psychol Sci, 11(2), 139-146.
Driver, J., McLeod, P., & Dienes, Z. (1992). Motion coherence and conjunction search: implications for guided search theory. Percept Psychophys, 51(1), 79-85.
Driver, J., Baylis, G. C., & Rafal, R. D. (1992). Preserved figure-ground segregation and symmetry perception in visual neglect. Nature, 360(6399), 73-75.
Driver, J. (1995). Object segmentation and visual neglect. Behav Brain Res, 71(1-2), 135-146.
Driver, J., & Mattingley, J. B. (1998). Parietal neglect and visual awareness. Nat Neurosci, 1(1), 17-22.
Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychol Rev, 96(3), 433-458.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex, 1(1), 1-47.
Fukushima, K., & Miyake, S. (1982). Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position. Pattern Recognition, 15, 455-469.
Gallant, J. L., Braun, J., & Van Essen, D. C. (1993). Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex. Science, 259(5091), 100-103.
Gauthier, I., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nat Neurosci, 3(2), 191-197.
Ghose, G. M., & Ts'o, D. Y. (1997). Form processing modules in primate area V4. J Neurophysiol, 77(4), 2191-2196.
Gottlieb, J. P., Kusunoki, M., & Goldberg, M. E. (1998). The representation of visual salience in monkey parietal cortex. Nature, 391(6666), 481-484.
Gross, C. G., Rocha-Miranda, C. E., & Bender, D. B. (1972). Visual properties of neurons in inferotemporal cortex of the Macaque. J Neurophysiol, 35(1), 96-111.
Hamker, F. H. (2000). Distributed competition in directed attention. In G. Baratoff & H. Neumann (Eds.), Proceedings in Artificial Intelligence (Vol. 9, pp. 39-44). Berlin: AKA, Akademische Verlagsgesellschaft.
He, Z. J., & Nakayama, K. (1992). Surfaces versus features in visual search. Nature, 359(6392), 231-233.
He, S., Cavanagh, P., & Intriligator, J. (1996). Attentional resolution and the locus of visual awareness. Nature, 383(6598), 334-337.
Heeger, D. J., & Ress, D. (2002). What does fMRI tell us about neuronal activity? Nat Rev Neurosci, 3(2), 142-151.
Hobson, J. A., Pace-Schott, E. F., Stickgold, R., & Kahn, D. (1998). To dream or not to dream? Relevant data from new neuroimaging and electrophysiological studies. Curr Opin Neurobiol, 8(2), 239-244.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of the monkey striate cortex. J Physiol (London), 195, 574-591.
Humphreys, G. W., & Riddoch, M. J. (1987). To See But Not To See: A Case Study of Visual Agnosia. Erlbaum.
Hupe, J. M., James, A. C., Payne, B. R., Lomber, S. G., Girard, P., & Bullier, J. (1998). Cortical feedback improves discrimination between figure and background by V1, V2 and V3 neurons. Nature, 394(6695), 784-787.
Husain, M., & Rorden, C. (2003). Non-spatially lateralized mechanisms in hemispatial neglect. Nat Rev Neurosci, 4(1), 26-36.
Itti, L., & Koch, C. (2001a). Computational modelling of visual attention. Nat Rev Neurosci, 2(3), 194-203.
Itti, L., & Koch, C. (2001b). Feature combination strategies for saliency-based visual attention systems. Journal of Electronic Imaging, 10(1), 161-169.
Jackendoff, R. S. (1987). Consciousness and the computational mind. MIT Press.
Julesz, B. (1975). Experiments in the visual perception of texture. Sci Am, 232(4), 34-43.
Julesz, B. (1981). Textons, the elements of texture perception, and their interactions. Nature, 290(5802), 91-97.
Julesz, B. (1986). Texton gradients: the texton theory revisited. Biol Cybern, 54(4-5), 245-251.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci, 17(11), 4302-4311.
Karnath, H. O. (2001). New insights into the functions of the superior temporal cortex. Nat Rev Neurosci, 2(8), 568-576.
Karnath, H. O., Ferber, S., & Himmelbach, M. (2001). Spatial awareness is a function of the temporal not the posterior parietal lobe. Nature, 411(6840), 950-953.
Kastner, S., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1998). Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science, 282(5386), 108-111.
Klein, R. M. (2000). Inhibition of return. Trends Cogn Sci, 4(4), 138-147.
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4(4), 219-227.
Koch, C. (2003). The quest for consciousness: a neurobiological perspective. CA: Roberts and Company Publishers.
Kooistra, C. A., & Heilman, K. M. (1989). Hemispatial visual inattention masquerading as hemianopia. Neurology, 39(8), 1125-1127.
Kreiman, G., Koch, C., & Fried, I. (2000). Category-specific visual responses of single neurons in the human medial temporal lobe. Nat Neurosci, 3(9), 946-953.
Kustov, A. A., & Robinson, D. L. (1996). Shared neural control of attentional shifts and eye movements. Nature, 384(6604), 74-77.
LaBerge, D., & Buchsbaum, M. S. (1990). Positron emission tomographic measurements of pulvinar activity during an attention task. J Neurosci, 10(2), 613-619.
Lamme, V. A., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends Neurosci, 23(11), 571-579.
Lamme, V. A., Super, H., Landman, R., Roelfsema, P. R., & Spekreijse, H. (2000). The role of primary visual cortex (V1) in visual awareness. Vision Res, 40(10-12), 1507-1521.
Lamme, V. A. (2003). Why visual attention and awareness are different. Trends Cogn Sci, 7(1), 12-18.
Lee, D. K., Itti, L., Koch, C., & Braun, J. (1999). Attention activates winner-take-all competition among visual filters. Nat Neurosci, 2(4), 375-381.
Li, F. F., VanRullen, R., Koch, C., & Perona, P. (2002). Rapid natural scene categorization in the near absence of attention. Proc Natl Acad Sci U S A, 99(14), 9596-9601.
Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annu Rev Neurosci, 19, 577-621.
Logothetis, N. K. (1998). Single units and conscious vision. Philos Trans R Soc Lond B Biol Sci, 353(1377), 1801-1818.
Logothetis, N. K., Pauls, J., Augath, M., Trinath, T., & Oeltermann, A. (2001). Neurophysiological investigation of the basis of the fMRI signal. Nature, 412(6843), 150-157.
Mack, A., & Rock, I. (1998). Inattentional Blindness. Cambridge, MA: MIT Press.
McAdams, C. J., & Maunsell, J. H. R. (1999). Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. J Neurosci, 19(1), 431-441.
McLeod, P., Driver, J., & Crisp, J. (1988). Visual search for a conjunction of movement and form is parallel. Nature, 332(6160), 154-155.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford University Press.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782-784.
Morel, A., & Bullier, J. (1990). Anatomical segregation of two cortical visual pathways in the macaque monkey. Vis Neurosci, 4(6), 555-578.
Motter, B. C. (1993). Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology, 70, 909-919.
Motter, B. C., & Holsapple, J. W. (2000). Cortical image density determines the probability of target discovery during active search. Vision Res, 40(10-12), 1311-1322.
Moutoussis, K., & Zeki, S. (2002). The relationship between cortical activation and perception investigated with invisible stimuli. Proc Natl Acad Sci U S A, 99(14), 9527-9532.
Nakayama, K., & Mackeben, M. (1989). Sustained and transient components of focal visual attention. Vision Res, 29(11), 1631-1647.
Nakayama, K., & Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257(5075), 1357-1363.
Nakayama, K., & Joseph, J. S. (1998). Attention, pattern recognition and pop-out in visual search. In R. Parasuraman (Ed.), The attentive brain (pp. 279-298). Cambridge, MA: MIT Press.
Newsome, W. T., Britten, K. H., & Movshon, J. A. (1989). Neuronal correlates of a perceptual decision. Nature, 341(6237), 52-54.
O'Regan, J. K., Rensink, R. A., & Clark, J. J. (1999). Change-blindness as a result of 'mudsplashes' [letter]. Nature, 398(6722), 34.
O'Regan, J. K., & Noe, A. (2001). A sensorimotor account of vision and visual consciousness. Behav Brain Sci, 24(5), 939-973; discussion 973-1031.
Palmer, J., Verghese, P., & Pavel, M. (2000). The psychophysics of visual search. Vision Res, 40(10-12), 1227-1268.
Pascual-Leone, A., & Walsh, V. (2001). Fast backprojections from the motion to the primary visual area necessary for visual awareness. Science, 292(5516), 510-512.
Perrett, D. I., Rolls, E. T., & Caan, W. (1982). Visual neurons responsive to faces in the monkey temporal cortex. Experimental Brain Research, 47, 329-342.
Posner, M. I., Walker, J. A., Friedrich, F. J., & Rafal, R. D. (1984). Effects of parietal injury on covert orienting of attention. J Neurosci, 4(7), 1863-1874.
Posner, M. I. (1994). Attention: the mechanisms of consciousness. Proc Natl Acad Sci U S A, 91(16), 7398-7403.
Rees, G., Wojciulik, E., Clarke, K., Husain, M., Frith, C., & Driver, J. (2000). Unconscious activation of visual cortex in the damaged right hemisphere of a parietal patient with extinction. Brain, 123(Pt 8), 1624-1633.
Rensink, R. A., O'Regan, J. K., & Clark, J. J. (1997). To see or not to see: the need for attention to perceive changes in scenes. Psychological Science, 8(5), 368-373.
Rensink, R. A. (2000). Seeing, sensing, and scrutinizing. Vision Res, 40(10-12), 1469-1487.
Rensink, R. A. (2002). Change detection. Annu Rev Psychol, 53, 245-277.
Ress, D., Backus, B. T., & Heeger, D. J. (2000). Activity in primary visual cortex predicts performance in a visual detection task. Nat Neurosci, 3(9), 940-945.
Reynolds, J. H., Chelazzi, L., & Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. J Neurosci, 19(5), 1736-1753.
Reynolds, J. H., & Desimone, R. (1999). The role of neural mechanisms of attention in solving the binding problem. Neuron, 24(1), 19-29, 111-125.
Reynolds, J. H., Pasternak, T., & Desimone, R. (2000). Attention increases sensitivity of V4 neurons. Neuron, 26(3), 703-714.
Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nat Neurosci, 2(11), 1019-1025.
Rizzolatti, G., Riggio, L., Dascola, I., & Umilta, C. (1987). Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia, 25(1A), 31-40.
Robertson, L. C., Treisman, A., Friedman-Hill, S. R., & Grabowecky, M. (1997). The interaction of spatial and object pathways: Evidence from Balint's syndrome. J Cog Neuroscience, 9, 295-317.
Robinson, D. L., & Petersen, S. E. (1992). The pulvinar and visual salience. Trends Neurosci, 15(4), 127-132.
Rockland, K. S., & Pandya, D. N. (1979). Laminar origins and terminations of cortical connections of the occipital lobe in the rhesus monkey. Brain Res, 179(1), 3-20.
Roelfsema, P. R., Lamme, V. A., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395(6700), 376-381.
Rousselet, G. A., Fabre-Thorpe, M., & Thorpe, S. J. (2002). Parallel processing in high-level categorization of natural images. Nat Neurosci, 5(7), 629-630.
Salinas, E., & Abbott, L. F. (1997). Invariant visual responses from attentional gain fields. J Neurophysiol, 77(6), 3267-3272.
Simons, D. J., & Levin, D. T. (1998). Failure to detect changes to people during a real-world interaction. Psychonomic Bulletin & Review, 5(4), 644-649.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic events. Perception, 28(9), 1059-1074.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (1997). Coding of intention in the posterior parietal cortex. Nature, 386(6621), 167-170.
Snyder, L. H., Grieve, K. L., Brotchie, P., & Andersen, R. A. (1998). Separate body- and world-referenced representations of visual space in parietal cortex. Nature, 394(6696), 887-891.
Somers, D. C., Dale, A. M., Seiffert, A. E., & Tootell, R. B. (1999). Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proc Natl Acad Sci U S A, 96(4), 1663-1668.
Sperling, G., & Melchner, M. J. (1978). The attention operating characteristic: examples from visual search. Science, 202(4365), 315-318.
Sperling, G., & Dosher, B. (1986). Strategy and optimization in human information processing. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance (pp. 1-65). New York: Wiley.
Steinmetz, M. A., & Constantinidis, C. (1995). Neurophysiological evidence for a role of posterior parietal cortex in redirecting visual attention. Cereb Cortex, 5(5), 448-456.
Super, H., Spekreijse, H., & Lamme, V. A. (2001). Two distinct modes of sensory processing observed in monkey primary visual cortex (V1). Nat Neurosci, 4(3), 304-310.
Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109-139.
Tarr, M. J., & Cheng, Y. D. (2003). Learning to see faces and objects. Trends Cogn Sci, 7(1), 23-30.
Theeuwes, J., & Kooi, F. L. (1994). Parallel search for a conjunction of contrast polarity and shape. Vision Res, 34(22), 3013-3016.
Thompson, K. G., & Schall, J. D. (2000). Antecedents and correlates of visual detection and awareness in macaque prefrontal cortex. Vision Res, 40(10-12), 1523-1538.
Thorpe, S. J., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520-522.
Tootell, R. B., Hadjikhani, N., Hall, E. K., Marrett, S., Vanduffel, W., Vaughan, J. T., & Dale, A. M. (1998). The retinotopy of visual spatial attention. Neuron, 21(6), 1409-1422.
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognit Psychol, 12(1), 97-136.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognit Psychol, 14(1), 107-141.
Treisman, A., & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Review, 95, 15-48.
Treisman, A., & Sato, S. (1990). Conjunction search revisited. J Exp Psychol Hum Percept Perform, 16(3), 459-478.
Treisman, A. (1996). The binding problem. Curr Opin Neurobiol, 6(2), 171-178.
Treisman, A. (1999). Solutions to the binding problem: progress through controversy and convergence. Neuron, 24(1), 105-110, 111-125.
Treue, S., & Maunsell, J. H. (1996). Attentional modulation of visual motion processing in cortical areas MT and MST. Nature, 382(6591), 539-541.
Treue, S., & Martinez Trujillo, J. C. (1999). Feature-based attention influences motion processing gain in macaque visual cortex. Nature, 399(6736), 575-579.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of Visual Behavior (pp. 549-586). Cambridge, MA: MIT Press.
Ungerleider, L. G., & Haxby, J. V. (1994). 'What' and 'where' in the human brain. Curr Opin Neurobiol, 4(2), 157-165.
Van Essen, D. C., Anderson, C. H., & Felleman, D. J. (1992). Information processing in the primate visual system: an integrated systems perspective. Science, 255(5043), 419-423.
VanRullen, R., & Thorpe, S. J. (2001). The time course of visual processing: from early perception to decision-making. J Cogn Neurosci, 13(4), 454-461.
VanRullen, R., & Thorpe, S. J. (2002). Surfing a spike wave down the ventral stream. Vision Res, 42(23), 2593-2615.
VanRullen, R., Reddy, L., & Koch, C. (2003). Visual search and dual-tasks reveal two distinct attentional resources. J Cog Neuroscience, submitted.
Vidyasagar, T. R. (1998). Gating of neuronal responses in macaque primary visual cortex by an attentional spotlight. Neuroreport, 9(9), 1947-1952.
Vogels, R. (1999). Categorization of complex visual images by rhesus monkeys. Part 2: single-cell study. Eur J Neurosci, 11(4), 1239-1255.
von der Heydt, R., Peterhans, E., & Baumgartner, G. (1984). Illusory contours and cortical neuron responses. Science, 224(4654), 1260-1262.
Von der Malsburg, C. (1981). The correlation theory of brain function. Gottingen, Germany: Max Planck Institute for Biophysical Chemistry.
Vuilleumier, P., Sagiv, N., Hazeltine, E., Poldrack, R. A., Swick, D., Rafal, R. D., & Gabrieli, J. D. (2001). Neural fate of seen and unseen faces in visuospatial neglect: a combined event-related functional MRI and event-related potential study. Proc Natl Acad Sci U S A, 98(6), 3495-3500.
Watanabe, T., Harner, A. M., Miyauchi, S., Sasaki, Y., Nielsen, M., Palomo, D., & Mukai, I. (1998). Task-dependent influences of attention on the activation of human primary visual cortex. Proc Natl Acad Sci U S A, 95(19), 11489-11492.
Weichselgartner, E., & Sperling, G. (1987). Dynamics of automatic and controlled visual attention. Science, 238(4828), 778-780.
Wolfe, J. M. (1984). Reversing ocular dominance and suppression in a single flash. Vision Res, 24(5), 471-478.
Wolfe, J. M. (1992). "Effortless" texture segmentation and "parallel" visual search are not the same thing. Vision Res, 32(4), 757-763.
Wolfe, J. M. (1998). Visual Search. In H. Pashler (Ed.), Attention (pp. 13-73). London, UK: University College London Press.
Zeki, S. (2001). Localization and globalization in conscious vision. Annu Rev Neurosci, 24, 57-86.
Zihl, J., von Cramon, D., & Mai, N. (1983). Selective disturbance of movement vision after bilateral brain damage. Brain, 106(Pt 2), 313-340.
Zipser, K., Lamme, V. A., & Schiller, P. H. (1996). Contextual modulation in primary visual cortex. J Neurosci, 16(22), 7376-7389.