Visual Perception
Software Design Context
Last term we looked at:
the EFFECT OF BROWSERS
Design as Practiced
Top Ten Mistakes in Web Design: May 1996/1999
From Jakob Nielsen’s Alertbox: www.useit.com
Visual Perception
These lectures indicated guidelines and mistakes to avoid.
In order to evaluate the usefulness or appropriateness of guidelines some understanding of your working environment is useful – and in ISD much of your working environment comprises people and how they work and view the world.
IS Development is often about guidelines and constraints but there are lots of opinions out there – innumerable guidelines and methodologies to fit (or not) the innumerable situations that commerce, society and government can throw up.
An understanding of visual perception is important because virtually everything you design and build will be viewed by your users on a VDU – engagement with systems is mostly through vision and actions resulting from interpretation of visual messages
Visual Perception
first some memory tests followed by some observations:
observe the next 5 slides and remember the values and the positions of the items on the slides
This lecture is about perception, but most of the work on perception has been about:
29
15
6
23
51
Waiting time
23
29
6
51
15
23
29
6
51
15
Visual Perception
memory works a lot better if we know what we are supposed to be looking at - instructions may be important to perception
perception appears to be some sort of a filtering mechanism - how much is the brain involved in perception?
Obviously, you all received the same visual stimuli but some parts of the stimuli were easier to remember than others because of the instructions given at the beginning of the test i.e. if I’d asked you to remember the COLOURS as well, you would have.
One more simple test:
MASKING
observe the next set of slides which will be displayed as quickly as the machine can do it
A
U
How many slides were there?
What was on the slides?
Masking: Experiments with a tachistoscope have shown that there is a difference in perception time between simple forms like A, U and T and more complex ones, letters within a ring, or double ring.
Rapid exposure to a second pattern from the same light source can cause the original pattern to be erased – effectively not seen.
2 slides: the first showing an A, the second showing a U inside a double circle
This is taken to mean that some sort of staging of perception exists. Whatever this actually is, it indicates that the brain isn’t just a receiver for incoming signals which results in vision.
The brain is obviously involved with perception in conjunction with the eyes but equally the stimulus received is also important for understanding perception.
Theories of Visual Perception by Ian E. Gordon, published by John Wiley & Sons, 1989.
Theories must meet certain criteria:
Should offer economical accounts of a range of facts. A theory is not much use if a description of it is as long as that required to describe the relevant phenomena
Should attempt to explain phenomena, or at least suggest causal links between them
Should be testable – should be stated in such a way that deductions can be derived and tested empirically
There is considerable variation in the style and language of theories of visual perception
The reason for this is essentially that none of them deals with exactly the same arena of perception
The greater the number of regions to be included in a theory the more that theory tends to be general in form
[Diagram: Brain, receptors, stimuli and effectors within the ENVIRONMENT]
Knowledge of the important properties of stimuli has come mainly from physics
The environment is the physical world of surfaces and objects - the ecology of the organism
Incoming stimuli from objects in the world give rise to events some of which can be detected by perceivers
Sensory surfaces take incoming stimuli and translate them into a neural code: it is important to know the nature of this transduction, how light is absorbed by the eye
Important questions concern the pathways taken by neural messages, the codes which are used to represent differences in quality, intensity and duration
The brain obviously has a role to play
Most behaviour depends upon brain processes but these are commonly not available to direct study and must be explored indirectly
Organisms make explicit responses to stimuli in the environment
Responses: The pupil constricts in response to light and sweat is produced by very brief exposure to taboo words
This can be used as a sign that the words have been detected and that they have induced emotional responses
we move around in the world and in this way partially determine the stimulation we receive
The quickest action any human is capable of is an eye movement
It has been discovered that the eye takes in much less information during an eye movement than when it is stationary
We may make eye movements which are abrupt and ballistic and also movements which are smoothly graded
what guides the selection of appropriate movements?
Regions of interest to perceptual theorists:
THE ENVIRONMENT
INCOMING STIMULATION
RECEPTOR SURFACES AND THE PERIPHERAL SENSORY NERVOUS SYSTEM
THE BRAIN
PERIPHERAL EFFECTOR PROCESSES
MOTOR RESPONSES BY THE PERCEIVER
therefore
GESTALT
• Wertheimer (1880-1943)
• Koffka (1886-1941)
• Köhler (1887-1967)
"why do things look as they do?"
what must be explained by perceptual theories is the stability and coherence of the world of everyday experience
Pencil and hole in page and your nose
Their statement of intent:
GESTALT
Whenever we open our eyes we see objects and surfaces, not sensations of light.
We can easily distinguish between figure and ground (the figure possesses Gestaltqualität) - ground is less distinct.
GESTALT
distinguishing between figure and ground
figure
ground
GESTALT
distinguishing between figure and ground
Differentiation between figure and ground can be confused in a two-dimensional image, such as that on a page or a screen
A white circle or a hole in the black triangle?
But it is not usually a problem in a three-dimensional world
GESTALT believed that:
there is a general, underlying principle behind the numerous examples of organisation which they discovered (see next few slides)
Gestalt theorists also laid much emphasis on the simple idea that the whole is greater than the sum of the parts
See also: Peter Checkland’s book Systems Thinking, Systems Practice
The Müller-Lyer illusion: the individual lines are objectively the same length but their relationship with the arrows creates an illusion which could not be predicted from knowledge of the individual components
GESTALT
the whole is greater than the sum of the parts
The Müller-Lyer illusion
GESTALT
natural organisation:
grouping by rows
grouping by columns
equal proximity:no dominant direction of grouping
grouping by continuation
grouping by similarity
GESTALT - many of those examples demonstrated the
Law of closeness:
* cow
* bat
* cat
boy *
man *
woman *
animals people
* cow
* bat
* cat
boy *
man *
woman *
GESTALT
Law of enclosure:
The Gestalt movement took a phenomenological approach rather than an introspective approach to perception.
GESTALT
Their explanation of perceptual and related phenomena took the form of hypothetical brain processes
these were part of a psycho-neural isomorphism
this is inherently nativist in its implications concerning the origins of perception in the individual perceiver
The philosophy of Immanuel KANT (1724-1804) was important to Gestalt theorists
he advanced the nativist theory
Empiricists
Richard Gregory born 1923
concluded that perceiving is an activity resembling hypothesis formation and testing.
Signals received by the sensory receptors trigger neural events
Appropriate knowledge interacts with these inputs, which are often incomplete, to create psychological data.
On the basis of such data, hypotheses are advanced to predict and make sense of events in the world
This chain of events is the process we call perceiving
Empiricists
The main arguments are:
perception allows behaviour to be generally appropriate even to non-sense object characteristics
when we “see” only 3 legs on a table
The PENROSE DESIGN
Empiricists
The main arguments are:
perception allows behaviour to be generally appropriate even to non-sense object characteristics
perception can mediate zero-time delay reactions
when we “see” only 3 legs on a table
perception can be ambiguous: if a single physical pattern can induce 2 different percepts, then perception cannot be tied to stimulation in a one-to-one manner
The NECKER CUBE
Even “impossible” designs are “rationalised”: perception can be paradoxical
Empiricists
The main arguments are:
perception can extract familiar objects from a cluttered background
perception seems to be aided by knowledge
but stereotyped, well reinforced knowledge can refute actual perception so that even if we know a hollow mask is hollow we still perceive it as a normal face
thrushes' searching image
Empiricists Conclusion:
We receive incomplete sets of data about the world and the visual perception system creates a representation of that world which is essentially a system of models, images or schemata.
Our perception of the world is INDIRECT, through the mediation of these models
Brunswik started the idea that there was a probabilistic functionalism involved in perception. He considered that cues arriving from the world were not only incomplete but uncertain.
Appropriate use of these cues had survival value - the environment of the organism was important to understanding perception.
Egon Brunswik was part of the FUNCTIONALIST school of perception
Probabilistic Functionalism
Brunswik-Reiter schematic faces:
He arrived at this conclusion after a series of experiments involving perceptions of a minimal set of facial characteristics.
Variation in:
nose length
forehead height
mouth height
eye separation
Brunswik-Reiter schematic faces:
Experimental design:
categorise according to these scales:
gay-sad
young-old
good-bad
likeable-unlikeable
beautiful-ugly
intelligent-unintelligent
energetic-unenergetic
associations found between:
apparent mood and age
character, likeability and beauty
intelligence and energy
[Figure: schematic faces labelled Good, Happy, Bad, Sad]
Brunswik-Reiter schematic faces:
Higher the mouth: gayer & younger the face, but the lower the apparent intelligence
longer noses generally had unfavourable effects
High foreheads received favourable judgements
Brunswik-Reiter schematic faces:
Conclusion:
strong impressions can be induced by very simple patterns
small changes can induce marked changes in the impressions they induce
these impressions are not particularly culturally biased - they are human stereotypes
Brunswik-Reiter schematic faces:
Conclusion:
There is a large amount of constancy and stability in an inherently unstable world (we will later see that Edward de Bono also asserts that anything that looks like the pattern is presumed to be the pattern, and we act accordingly without further thought being involved)
the ecology of the perceiver is important
small changes are important - changes are what stimulate active recognition
JJ GIBSON
DIRECT PERCEPTION or ECOLOGICAL OPTICS
He objected to empiricism, which said:
•We cannot be directly aware of the physical world. Colour, say, resides not in objects but in our heads.
•Perception takes the form of sequential samples. So to achieve unified perceptions we integrate visual input over time.
•Sensory inputs are usually too impoverished to specify external scenes or objects.
•Illusions force us to accept that perception may be non-veridical.
JJ GIBSON
•Incomplete sensory inputs mean the perceiver must add to them. This elaboration of sensations involves inferential processes utilising memory, habit, set and so on.
DIRECT PERCEPTION or ECOLOGICAL OPTICS
objections:
•Survival pressures require that inferential processes deliver "correct" solutions most of the time - we successfully go beyond the sensory evidence - but sometimes inferences fail and we experience illusions or other 'errors' of perception.
•Illusions also confirm the constructional nature of perceiving.
JJ GIBSON
EMPIRICISM v DIRECT PERCEPTION
take the example of size constancy:
Empiricism: single ray line drawings are made showing visual angles
but when we consider a real scene the problem changes
Not all the light from the source being observed comes directly to the eye
• rays reflect from a wall to the object then to the eye
• others come from the floor and then the object
• others from the surface on which it is standing not the object
JJ GIBSON
The eye is bathed in a sea of radiant energy, of complex interactions between light rays moving in different directions many of which have been reflected by surfaces
EMPIRICISM v DIRECT PERCEPTION
the result is:
The visual world comprises surfaces under illumination
reminiscent of Gestalt?
light travels in straight lines and carries information about the environment through which it has travelled and from which it has been reflected
JJ GIBSON
EMPIRICISM v DIRECT PERCEPTION
basis of direct perception is:
• Light arriving at the eye in real situations is structured
• It is highly complex and potentially rich in information
• A single momentary retinal image may be impoverished, but this is not true when the eyes sweep over a normal scene
JJ GIBSON
the niche occupied by the organism
DIRECT PERCEPTION or ECOLOGICAL OPTICS
HOW the light is likely to be structured is important and is related to:
its ecological optics, which determines a number of INVARIANTS
an animal's perceptual systems can only be understood by considering its lit environment
JJ GIBSON
as an object moves further away its image does get smaller but is bounded by textured surfaces and the grain of this texture gets finer as the object recedes
DIRECT PERCEPTION or ECOLOGICAL OPTICS
When we do this we can understand other things when considering size constancy:
they also obscure a portion of the textured ground against which they are seen
The further away an object is the closer it will be to the horizon
the important point is that objects are not usually judged in complete isolation
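The size constancy argument can be sketched numerically. This is only an illustration under assumed values: the object size, the size of one texture element and the viewing distances are invented, and the function names are hypothetical. It shows that the visual angle of the object shrinks with distance, while its size relative to the surrounding texture stays roughly the same.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by something of a given size at a given distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Assumed values for illustration only (metres).
OBJECT_SIZE = 1.0      # height of the object
TEXTURE_UNIT = 0.1     # size of one element of ground texture, e.g. a floor tile
DISTANCES = (2.0, 4.0, 8.0)

object_angles = [visual_angle_deg(OBJECT_SIZE, d) for d in DISTANCES]
texture_angles = [visual_angle_deg(TEXTURE_UNIT, d) for d in DISTANCES]

# The ratio of the two angles is the candidate invariant: the object always
# spans about the same number of texture elements, however far away it is.
ratios = [o / t for o, t in zip(object_angles, texture_angles)]

for d, o, t, r in zip(DISTANCES, object_angles, texture_angles, ratios):
    print(f"distance {d:4.1f} m: object {o:6.2f} deg, texture {t:5.2f} deg, ratio {r:5.2f}")
```

The image of the object gets smaller as it recedes, but so does the grain of the texture it stands on, so the relation between the two carries the size information.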
JJ GIBSON
they expand as one approaches them and contract as they pass beyond the head
The changes in texture, like other things in our environment, are not random:
this situation will be the case whenever we move towards something
there is a higher-order pattern of structure and this is available as a source of information about the environment
Another example is the law of reversible occlusion, which states that the hidden and unhidden real things can be interchanged by moving around. Going out of sight is not the same as going out of existence.
the flow of the texture is invariant
The essence of invariants is that they are associated with change
JJ GIBSON
"Roughly, the affordances of things are what they furnish, for good or ill, that is, what they afford the observer"
affordance:
these include surfaces that are stand-on-able or sit-on-able, objects that are graspable or throwable, objects that afford hitting, surfaces that afford supporting, substances that afford pouring
this shows the influence of functionalism and accords with the views of Koffka and other Gestalt theorists who stressed the meaningfulness of the perceived world
the set of effectivities available to the organism, or the actions allowed
the originality of Gibson's affordances lies in his claim that they can be perceived directly, without prior synthesis or analysis - the “directness” he is referring to is not universal but refers to knowledge of the environment which is tacit.
David MARR - the computational theory of perception
worked at Cambridge University
His work on artificial intelligence led to numerous papers on perception and finally to his book Vision which was published posthumously (he died age 35 in 1980)
important developments which contributed to his theories:
information theory
cybernetics
construction of large digital computers
MARR
contributing studies used in developing his theory
cat visual cortex cells respond differentially to lines and edges according to the orientation of these stimuli
this suggests that the visual system analyses visual inputs into specific components, and that the mechanisms which do this are "wired into" the nervous system
it is therefore possible that the perception of certain basic features of the world is unlearned
MARR
contributing studies used in developing his theory
Random dot stereograms:
a powerful illusion of depth arises because the paired stereograms contain central portions which differ slightly
this triggers normal stereopsis: disparity of left and right views
the strange and wonderful thing about the random dot stereograms is that the disparity is not visible - the arrays contain no hint of form
this proves that the visual system can extract disparity information in the absence of pattern recognition
MARR
contributing studies used in developing his theory
spatial frequencies of test gratings
[test gratings: displays containing black and white stripes, with a varying number of stripes per degree of visual angle]
staring for a time at a particular grating reduces sensitivity to that grating temporarily
this is not a general loss of visual acuity because sensitivity to other spatial frequencies remains unchanged
in the cat visual cortex cells are differentially sensitive to particular spatial frequencies. It would appear that one could consider the acuity of vertebrate visual systems in terms of “tuned” channels
MARR
computation theory algorithm hardware
Vision must start with the image on the retina
the end point is our awareness of the world
There seems to be a picture of the world available to us whenever we open our eyes and look around. But the fact is that light stops at the retina. There can be no actual pictures in our heads, only neural activity. It follows that this neural activity is representing the world symbolically, and we must therefore strive to understand this symbolic process.
Marr argues that symbolic representations of various aspects of the world, initially obtained from the retinal image, are combined into the descriptions which we call seeing.
MARR
computation theory algorithm hardware
why is it important to be able to perceive contours - what use is this to the perceiver? In other words, what of importance in the real world correlates with contours in the visual image? Why should the visual system work to make them explicit?
perception of contours
How might contours be represented symbolically in our heads?
matches Brunswik's concern over the ecological validity of cues
answer: a contour is likely to arise as an edge - a feature which reveals discontinuities between the surfaces of different objects
MARR
computation theory algorithm hardware
quite a lot is known about contour perception
perception of contours
the information passed on is about rates of change, not homogeneous illumination or grading of intensity, and it is this which is "extracted" as relevant contour information
excitatory and inhibitory fields in the retina, which respond to entire edges, would be needed for such a mechanism, with rules for interactions of fields
hardware needed for such an algorithm is present in the retinal ganglion cells which act through mutual inhibition
computation theory
algorithm
hardware
a theory in which the main job of vision is to derive a representation of shape
MARR
computation theory algorithm hardware
He also used his knowledge of computer science to formulate a guiding principle - modular design
in solving computational problems generally, it is wise to break down the computation into component parts which should proceed as independently as possible
image
primal sketch
2½-D sketch
3-D model
MARR
image primal sketch 2½-D sketch 3-D model
image: the retinal image is a spatial distribution of intensity values
primal sketch: takes raw intensity values of the image and makes explicit certain forms of information contained therein. The most important information concerns the spatial distribution of intensity changes and how they are organised - allows the possible detection of surfaces
MARR
image primal sketch 2½-D sketch 3-D model
2½-D sketch: orientation and rough depth of visible surfaces are made explicit: it is as if a "picture" of the world is beginning to emerge. However, what is emerging is organised with reference only to the viewer, it is not yet linked to a stable, external environment
3-D model representation: the shapes and their orientation become explicit as tokens of three-dimensional objects organised in an object-centred framework i.e. independent of particular positions and orientations on the retina. By this final stage of vision the perceiver has attained a model of the external world
MARR
primal sketch
from the array of intensities (retinal image) certain primitives or place tokens are derived:
• zero-crossings
• edges
• bars
• blobs
• terminations
• edge segments
• virtual lines
• groups
• curvilinear organisation
• boundaries
Consider a TV commercial using a reverse zoom:
• initially a group of individuals is seen - they are assigned a visual token each
• as the camera rises, zooms backwards, it can be seen that the people form various groupings (a token for each group)
• at its greatest height the camera reveals the people as a letter - seeing each letter as a coherent whole implies that it must be represented in the visual system (as a new token)
MARR
primal sketch
During the development of the primal sketch groups of adjacent tokens having a common property, such as orientation, are replaced by "level one" tokens representing this common property. Then, if there are whole groups of similarly oriented level one tokens, these are used to construct boundaries between parts of the full primal sketch.
The question arises as to how the visual primitives of the primal sketch are actually extracted. Let us examine the primitives known as zero-crossing.
MARR
primal sketch
zero-crossings:
[Figure: an intensity change, with its first derivative and second derivative]
Important information about a shape and its orientation comes from edges, contours and boundaries: that is from areas in the image where intensity values are changing rapidly
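The idea of finding an edge from a zero-crossing can be sketched in one dimension. This is an illustrative sketch only, not Marr's actual implementation: the blurred edge profile and its position (40.3) are assumed values, and `zero_crossings` is a hypothetical helper. The first derivative peaks where intensity changes fastest, and the second derivative changes sign at the same place.

```python
import numpy as np

def zero_crossings(values):
    """Indices i where the values change sign between sample i and sample i + 1."""
    signs = np.sign(values)
    return [i for i in range(len(values) - 1) if signs[i] * signs[i + 1] < 0]

# Assumed test signal: a row of pixel intensities containing one blurred edge,
# dark on the left, bright on the right.
x = np.arange(81)
intensity = 1.0 / (1.0 + np.exp(-(x - 40.3) / 3.0))

first = np.gradient(intensity)   # rate of change of intensity
second = np.gradient(first)      # rate of change of that rate

steepest = int(np.argmax(first))     # sample where intensity changes fastest
crossings = zero_crossings(second)   # where the second derivative changes sign

print("steepest intensity change at sample", steepest)
print("zero-crossing(s) of the second derivative at sample(s)", crossings)
```

Looking for a sign change is easier than looking for a peak, because the test is the same whatever the contrast of the edge.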
MARR
primal sketch
zero-crossings: essentially act as spatial frequency filters which may be tuned to different scales to capture information on edges
For example: hold up the handout in front of you and face the side walls. Close your eyes tight and slowly open them a very small amount to view the page. The page will be seen as a bright area against the wall and within this brightness will be a grey area. This grey area on closer inspection will be seen to be a series of grey blocks (paragraphs) and on widening the eyes more the blocks are seen to be a series of black and white horizontal stripes and finally the black stripes can be resolved into individual words or letters depending on our level of focus and concentration.
It appears that if several different spatial filters agree on the position of a contour in an image, then an edge in the real world exists.
A series of filters (zero-crossings) have been processed
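The agreement between filters of different scales can also be sketched in one dimension. Everything here is an assumption made for illustration: the test signal (a thin bar of fine detail at sample 20 and a step edge at sample 60), the three filter widths, the tolerance of one sample, and all function names. Each filter blurs the signal with a Gaussian and then takes the second difference; only the positions where every scale reports a zero-crossing are kept.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised Gaussian weights, truncated at four standard deviations."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def filtered_second_difference(signal, sigma):
    """Blur at one scale, then take the discrete second derivative."""
    k = gaussian_kernel(sigma)
    radius = len(k) // 2
    padded = np.pad(signal, radius, mode="edge")   # avoid false edges at the ends
    smooth = np.convolve(padded, k, mode="valid")
    d = np.zeros_like(smooth)
    d[1:-1] = smooth[2:] - 2 * smooth[1:-1] + smooth[:-2]
    d[np.abs(d) < 1e-12] = 0.0                     # ignore rounding noise in flat areas
    return d

def zero_crossings(values):
    """Indices i where the values change sign between sample i and sample i + 1."""
    signs = np.sign(values)
    return [i for i in range(len(values) - 1) if signs[i] * signs[i + 1] < 0]

def agreed_edges(signal, sigmas=(1.0, 2.0, 4.0), tolerance=1):
    """Zero-crossings of the finest filter that every coarser filter also reports."""
    per_scale = [zero_crossings(filtered_second_difference(signal, s)) for s in sigmas]
    finest, coarser = per_scale[0], per_scale[1:]
    return [p for p in finest
            if all(any(abs(p - q) <= tolerance for q in scale) for scale in coarser)]

# Assumed test signal: fine detail (a thin bar) plus one step edge.
signal = np.zeros(100)
signal[20] = 0.3
signal[60:] = 1.0

edges = agreed_edges(signal)
print("positions on which all scales agree:", edges)
```

In this sketch the zero-crossings around the thin bar move apart as the filter gets coarser, so the scales disagree about them, while the crossing at the step stays in the same place at every scale.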
MARR
primal sketch
zero-crossings:
there are cells in the retina and the lateral geniculate nucleus which exhibit receptive field properties - activity of the cells can be shown to reflect patterns of stimulation of groups of retinal cells
the mechanism:
receptive fields are organised in various ways and shapes
zero-crossing primitives correspond to On-centre and Off-surround receptive fields:
The Hermann grid
micro-electrodes inserted into the visual pathway can pick up impulses from a single neuron
in this way one can discover which region of retinal receptors is connected to that neuron because patterns projected on one part of the screen will make the cell respond with a stream of impulses
the area of the screen that causes the neuron to respond is called its receptive field
if parts of a neuron's receptive field are illuminated, that cell gives a burst of impulses either when the light turns on or when it turns off.
for about 50% of ganglion cells, light falling in the very centre of the receptive field gives on-responses while a spot of light limited to a surrounding area actually suppresses activity of the cell while it is turned on and causes an off-response when the light is extinguished.
[Figure labels: large signal, small signal]
Explanation of Hermann’s grid using the receptive field and lateral inhibition theory
Illumination of both the central and surrounding regions of the receptive field by a large spot of light causes a much weaker response than light on the centre alone
A white bar appears white because an “on” receptive field is stimulated by the white bar but the dark squares are not stimulating “off” fields. At the intersection, the white cross stimulates both “on” and “off” receptive fields, giving the grey shade.
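The receptive field account can be sketched with a simple model. This is a sketch under stated assumptions, not a model of real ganglion cells: the on-centre, off-surround field is approximated by a difference of two Gaussians, and the field widths, spot sizes, grid dimensions and all function names are invented for illustration.

```python
import numpy as np

HALF = 20  # assumed half-width of the receptive field, in pixels

def centre_surround_kernel(sigma_centre=2.0, sigma_surround=6.0):
    """On-centre, off-surround field: narrow excitation minus broad inhibition."""
    ax = np.arange(-HALF, HALF + 1)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    centre = np.exp(-r2 / (2 * sigma_centre**2))
    surround = np.exp(-r2 / (2 * sigma_surround**2))
    return centre / centre.sum() - surround / surround.sum()

def field_response(image, row, col, kernel):
    """Response of a field centred on (row, col): weighted sum of the light falling on it."""
    patch = image[row - HALF:row + HALF + 1, col - HALF:col + HALF + 1]
    return float((patch * kernel).sum())

def spot(radius, size=101):
    """A white disc of the given radius on a black background."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    return (xx**2 + yy**2 <= radius**2).astype(float)

def hermann_grid(size=181, period=60, bar=7):
    """Black squares separated by white bars (1.0 = white, 0.0 = black)."""
    street = (np.arange(size) % period) < bar
    return (street[:, None] | street[None, :]).astype(float)

kernel = centre_surround_kernel()

# A small spot on the centre alone versus a large spot covering centre and surround.
small_spot = field_response(spot(3), 50, 50, kernel)
large_spot = field_response(spot(20), 50, 50, kernel)

# A field centred on a white bar versus one centred on an intersection of bars.
grid = hermann_grid()
on_bar = field_response(grid, 63, 93, kernel)
on_intersection = field_response(grid, 63, 63, kernel)

print("small spot:", small_spot, " large spot:", large_spot)
print("bar:", on_bar, " intersection:", on_intersection)
```

In this model the large spot gives a much weaker response than the small one, and the field sitting on an intersection responds less than the field sitting on a bar because more of its inhibitory surround is lit, which fits the grey smudges seen at the intersections.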
The Hermann grid
• First, the retina is telling the brain mainly about the beginning and end of each retinal illumination
• Secondly, localised illumination is much better than diffuse light. Ganglion cells detect changes in the level of light - differences of illumination in time or space
conclusion from the Hermann Grid:
If it is difficult to believe that nothing stationary is visible, it is even harder to admit that we simply cannot see uniform areas of light and dark except by virtue of their edges, and yet this is certainly true.
It is even possible to fool the brain into thinking that two identical areas differ in brightness simply by creating an apparent edge between them
Only the transition edge can be detected by our ganglion cells, but we perceive the pattern as if the intensity on each side continued uniformly away from the edge.
Global village a weekly posting from cyberspace
Announcing the new Built-in Orderly Organised Knowledge device
The BOOK is an evolutionary breakthrough in technology: no wires, no batteries, nothing to be connected or switched on. It's so easy to use even a child can operate it. Just lift its cover. Compact and portable, it can be used anywhere, yet it is powerful enough to hold as much information as a CD-ROM disc.
Each BOOK is constructed of sequentially numbered sheets of paper (recyclable), each capable of holding thousands of bits of information. Each sheet is scanned optically, registering information directly into your brain. A flick of the finger takes you to the next sheet. The BOOK never crashes and never needs rebooting. The "browse" feature allows you to move instantly to any sheet.
Many come with an "index" feature, which pinpoints the exact location of any selected information for instant retrieval.
An optional "Bookmark" accessory allows you to open the BOOK to the exact place you left it in a previous session. The BOOK is ideal for long-term archive use. Several field trials have proved that the medium will still be readable in several centuries' time. You can also make personal notes next to BOOK text entries with an optional programming tool, the Portable Erasable Nib Cryptic Intercommunication Language Stylus (PENCILS). The BOOK's appeal seems so certain that thousands of content creators have committed to the platform.
BOOK
Thanks to Paul Templar for this: