

Vision Research 44 (2004) 727–749

www.elsevier.com/locate/visres

Uncertainty in visual processes predicts geometrical optical illusions ☆

Cornelia Fermüller a,*, Henrik Malm b

a Department of Computer Science, Computer Vision Laboratory, Center for Automation Research, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742-3275, USA

b Department of Mathematics (LTH), Mathematical Imaging Group (MIG), Lund Institute of Technology/Lund University, P.O. Box 118, S-221 00 Lund, Sweden

Received 8 June 2001; received in revised form 25 September 2003

Abstract

It is proposed in this paper that many geometrical optical illusions, as well as illusory patterns due to motion signals in line

drawings, are due to the statistics of visual computations. The interpretation of image patterns is preceded by a step where image

features such as lines, intersections of lines, or local image movement must be derived. However, there are many sources of noise or

uncertainty in the formation and processing of images, and they cause problems in the estimation of these features; in particular,

they cause bias. As a result, the locations of features are perceived erroneously and the appearance of the patterns is altered. The bias

occurs with any visual processing of line features; under average conditions it is not large enough to be noticeable, but illusory

patterns are such that the bias is highly pronounced. Thus, the broader message of this paper is that there is a general uncertainty

principle which governs the workings of vision systems, and optical illusions are an artifact of this principle.

© 2003 Elsevier Ltd. All rights reserved.

Keywords: Optical illusions; Motion perception; Bias; Estimation processes; Noise

1. Introduction

Optical illusions are fascinating to almost everyone, and with the recent surge of interest in the study of the mind they have been widely popularized. Some optical illusions, such as the distortion effects in large architectural structures or the moon illusion, have been known since antiquity. Some, such as the Müller-Lyer illusion or the Penrose triangle, are by now considered classics and are taught in schools. Many new illusory patterns have been created in the last few years. Some of these are aesthetically pleasing variations of known effects, but others have introduced new effects, most prominently in motion and lightness.

Scientific work on optical illusions started in the 19th century, when scientists began to study perception systematically, and interest has endured ever since. What has sustained this long-standing effort? Clearly, illusions reveal something about human limitations, and by their nature they are obscure and thus fascinating. But this has not been the sole reason for scientific interest. Theorists of perception have used them as test instruments for theory, an effort that originated with the founders of the Gestalt school. An important strategy in finding out how correct perception operates is to observe situations in which misperception occurs. Any theory, to be influential, must be consistent with the facts of correct perception but must also be capable of predicting the failures of the perceptual system. In the past the study of illusions was mostly carried out by psychologists, who tried to gain insight into the principles of perception by carefully altering the stimuli and testing the changes in visual performance. In recent decades they have been joined by scientists from other mind-related fields such as neurology, physiology, philosophy, and the computational sciences, who examine the problem from different viewpoints with different tools (Gillam, 1998; Palmer, 1999).

☆ The support of this research by the National Science Foundation under grant IIS-00-8-1365 is gratefully acknowledged.

* Corresponding author. E-mail address: [email protected] (C. Fermüller).

0042-6989/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.visres.2003.09.038

The best known and most studied of all illusions are the geometrical optical illusions. The term is a translation of the German geometrisch-optische Täuschungen and has been used for any illusion seen in line drawings.


It was coined by Oppel (1855) in a paper about the overestimation of an interrupted extent compared with an uninterrupted one, later called the Oppel–Kundt illusion (Kundt, 1863). Other famous illusions in this class include the Müller-Lyer (Müller-Lyer, 1896), Poggendorff, and Zöllner illusions (Zöllner, 1860), the Fraser spiral (Fraser, 1908), and the contrast effect (Oyama, 1960). The number of illusory patterns that fall in this class is very large, and the perceptual phenomena seem quite diverse. This is reflected in the cornucopia of explanations found in the literature, most of them concerned with only one illusion or a small number of illusions (Robinson, 1972).

In this paper we propose a theory that predicts a large number of geometrical optical illusions. The theory states that the statistics of visual computations is the cause, or one of the major causes, underlying geometrical optical illusions and, by extension, illusory patterns due to motion signals in line drawings. In a nutshell, when a pattern is interpreted, features in the image such as lines, intersections of lines, or local image motion must be derived; that is, they must be estimated from the input data. Because of noise, systematic errors occur in the estimation of the features; in statistical terms, the estimates are biased. As a result, the locations of features are perceived erroneously and the appearance of the pattern is altered. The bias occurs with any visual processing of features; under average conditions it is not large enough to be noticeable, but illusory patterns are such that the bias is strongly pronounced.

In somewhat more detail, the proposed theory is as follows. Our eyes receive as input a sequence of images. The early visual processing apparatus extracts local image measurements from these images. We consider three kinds: the intensity value of image points; small edge elements (edgels); and image motion perpendicular to local edges (normal flow). These image measurements can only be derived within a range of accuracy. In other words, there is noise in the intensity of image points, in the positions and orientations of edge elements, and in the directions and lengths of normal flow vectors. The first interpretation processes estimate local edges from image intensities, intersections of lines from edgels, and local 2D image motion from normal flow measurements. These estimation processes are biased. Thus the perceived positions of edgels are shifted, their directions are tilted, and the intersections of edges and the image movement are estimated incorrectly. The local edgel and image motion estimates serve as input to the next higher-level interpretation processes. Long straight lines or general curves are fitted to the edgels, and this gives rise to the tilted and displaced straight lines and distorted curves perceived in many illusory patterns. In the case of motion, the local image measurements are combined in segmentation and 3D motion estimation processes, and because the biases differ greatly between separated regions, this gives rise to the perception of different motions.

The noise originates from a variety of sources. First, there is uncertainty in the images perceived on the retina of an eye because of physical limitations; the lenses cause blurring, and there are errors due to quantization and discretization. There is uncertainty in position, since images taken at different times need to be combined, and errors occur in the geometric compensation for location. Even when we view a static pattern our eyes perform movements (Carpenter, 1988) and gather a series of images (either by moving freely over the pattern or by fixating at some point on it). Next, these noisy images have to be processed to extract edges and their movement. This is done through some form of differentiation process, which also introduces noise. Evidence suggests that in the human visual system orientation-selective cells in the cortex respond to edges in different directions (Blasdel, 1992; Hubel & Wiesel, 1961, 1968), and thus errors occur due to quantization. Because of these different sources, there is noise or uncertainty in the image data used in early visual processes, that is, in the image intensity values and in their differences in space and time, i.e., the spatial and temporal derivatives.

Other authors have discussed uncertainty in measurements before, and argued that optical or neural blur is a cause of some geometrical illusions (Ginsburg, 1975, 1984; Glass, 1970; Grossberg & Mingolla, 1985a, 1985b). Most closely related to our work are the seminal studies of Morgan and coworkers (Morgan, 1999; Morgan & Casco, 1990; Morgan & Moulden, 1986) and subsequent studies by others (Bulatov, Bertulis, & Mickiene, 1997; Earle & Maskell, 1993), which propose models of band-pass filtering to account for a number of illusions. These studies invoked the concept of noise in intuitive terms, since band-pass filtering also constitutes a statistical model of edge detection in noisy gray-level images. This will be elaborated in the next section. However, the model of band-pass filtering is not powerful enough to explain the estimation of features other than edges. For this we need to employ point estimation models. Thus, the theme of our study is that band-pass filtering is a special case of a more general principle, namely that uncertainty or noise causes bias in the estimation of image features, and this principle accounts for a large number of geometrical optical illusions that previously have been considered unrelated.

We should stress here that we use the term bias in the statistical sense. In the psychophysical literature the term has been used informally to refer to consistent deviations from the veridical, but without implying an underlying cause.

Bias in the statistical sense means the following. We have noisy measurements available and we use a procedure, which we call the estimator, to derive from these measurements a quantity; let us call it parameter $x$. Any particular small set of measurements leads to a different estimated value for parameter $x$. Assume we perform the estimation of $x$ many times, using different sets of measurements. The mean of the estimates of $x$ (that is, the average of an infinite number of values) is called the expected value of the estimate of $x$. If the expected value is equal to the true value, the estimate is called unbiased; otherwise it is biased. In the interpretation of images a significant amount of data is used. Features are extracted by means of estimation processes, for which the mean and the bias are characteristics. This justifies the use of the bias in analysing the perception of features.
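The definition of bias above can be made concrete with a small simulation. The sketch below is an illustration only, not a model taken from the paper: it assumes a hypothetical scalar relation $b = a\,x$ in which both $a$ and $b$ are measured with i.i.d. Gaussian noise, uses ordinary least squares as the estimator, and averages the estimates over many independent measurement sets. The function name `mean_slope_estimate` and all parameter values are made up for this example.

```python
import numpy as np

def mean_slope_estimate(x_true=2.0, noise_sd=0.5, n_points=20,
                        n_trials=5000, seed=0):
    """Average least-squares estimate of x in b = a * x over many noisy data sets."""
    rng = np.random.default_rng(seed)
    a_true = np.linspace(-1.0, 1.0, n_points)
    b_true = x_true * a_true
    # Each row is one independent set of noisy measurements of a and b.
    a_obs = a_true + rng.normal(0.0, noise_sd, (n_trials, n_points))
    b_obs = b_true + rng.normal(0.0, noise_sd, (n_trials, n_points))
    # Ordinary least squares through the origin, one estimate per row.
    estimates = (a_obs * b_obs).sum(axis=1) / (a_obs ** 2).sum(axis=1)
    # The sample mean approximates the expected value of the estimator.
    return float(estimates.mean())

if __name__ == "__main__":
    print("noise-free mean estimate:", mean_slope_estimate(noise_sd=0.0))
    print("noisy mean estimate:     ", mean_slope_estimate(noise_sd=0.5))
```

With noise-free data the mean estimate equals the true value. With noise in the measured $a$ values, the mean settles below the true value however many trials are averaged, which is what distinguishes a bias from random scatter.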

The following three sections provide a detailed analysis of the bias in the estimation of the three basic features: the line, the point, and the movement of points. (The line and the point are the elementary units of the plane, and thus they should be the basic features of static images; the movement of points is the elementary unit of sequences of images.) In particular, Section 2 models the estimation of edgels from gray values, Section 3 models the estimation of points as intersections of edgels, and Section 4 models the estimation of optic flow from image derivatives. For each model we discuss a number of illusions that are best explained by it. We should emphasize that our goal is to model general computations, not the specifics of the human vision system. Our vision system probably uses different kinds of data for many interpretation processes. The estimators analyzed are linear procedures, as these constitute the simplest ways to estimate features in the absence of knowledge about the scene, but we will discuss in Section 5 that other, more elaborate estimation processes are biased as well when the noise parameters are not known. The final Section 6 discusses the relationship to other theories of illusions, and argues that bias is a general problem of estimation from noisy data and thus affects other visual computations as well.

2. Bias in edge elements

Consider viewing a static scene such as the pattern in Fig. 2. Let the irradiance signal coming from the scene, parameterized by image position $(x, y)$, be $I(x, y)$. The image received on the retina can be thought of as a noisy version of the ideal signal. There are two kinds of noise sources to be considered. First, there is noise in the value of the intensity. Assuming this noise is additive, independently and identically distributed, it does not affect the location of edges. Second, there is noise in the spatial location. In other words, there is uncertainty in the position: the ideal signal is at location $(x, y)$ in the image; the noisy signal is with large probability at $(x, y)$, but with smaller probability it could also be at location $(x + \delta x, y + \delta y)$. Let the error in position have a Gaussian probability distribution. The expected value of the image is then obtained by convolving the ideal signal with a Gaussian kernel $g(x, y, \sigma_p)$, with $\sigma_p$ the standard deviation of the positional noise; that is, the expected intensity at an image point amounts to

$$E(I(x, y)) = I(x, y) \star g(x, y, \sigma_p)$$

Gaussian smoothing of static images has been intensively studied in the literature on linear scale space (Koenderink, 1984; Lindeberg, 1994; Witkin, 1983; Yuille & Poggio, 1986), and we can apply the theoretical results derived there.
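The claim that positional noise turns the expected image into a smoothed image can be checked numerically. The following is a minimal 1D sketch under simplifying assumptions that are not part of the paper: positions are jittered independently per pixel by integer offsets drawn from a truncated, discretized Gaussian, and the signal is extended by its border values. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Integer offsets and normalized Gaussian weights on [-radius, radius]."""
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-offsets ** 2 / (2.0 * sigma ** 2))
    return offsets, weights / weights.sum()

def expected_by_convolution(signal, sigma, radius):
    """Expected signal predicted by convolving with the Gaussian kernel."""
    _, weights = gaussian_kernel(sigma, radius)
    padded = np.pad(signal, radius, mode="edge")
    return np.convolve(padded, weights, mode="valid")

def expected_by_sampling(signal, sigma, radius, n_trials=4000, seed=0):
    """Average of many observations in which each pixel is read at a jittered position."""
    rng = np.random.default_rng(seed)
    offsets, weights = gaussian_kernel(sigma, radius)
    n = len(signal)
    shifts = rng.choice(offsets, size=(n_trials, n), p=weights)
    idx = np.clip(np.arange(n) + shifts, 0, n - 1)  # border pixels repeat
    return signal[idx].mean(axis=0)

if __name__ == "__main__":
    step = np.where(np.arange(41) >= 20, 1.0, 0.0)  # ideal dark-to-bright edge
    conv = expected_by_convolution(step, sigma=2.0, radius=8)
    mc = expected_by_sampling(step, sigma=2.0, radius=8)
    print("largest deviation:", np.max(np.abs(conv - mc)))
```

The sampled average approaches the convolved signal as the number of trials grows; the remaining difference is Monte Carlo scatter.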

Edge detection mathematically amounts to localizing the extrema of the first-order derivatives (Canny, 1986) or the zero crossings of second-order derivatives (the Laplacian) (Marr & Hildreth, 1980) of the image intensity function. We are interested in the positions of edges, or the change in positions of edges with variation of the smoothing parameter. Lindeberg (1994) derived formulae for that change in edge position or, equivalently, the instantaneous velocity of edge points in the direction normal to edges, which he called the drift velocity. We will refer to it as edge displacement.

Consider at every edge point $P_0$ a local orthonormal coordinate system $(u, v)$ with the $v$-axis parallel to the spatial gradient direction at $P_0$ and the $u$-axis perpendicular to it. If edges are given as the zero crossings of the Laplacian, the edge displacement $(\partial_t u, \partial_t v)$ (where $t$ denotes the scale parameter) amounts to

$$(\partial_t u, \partial_t v) = -\frac{\nabla^2(\nabla^2 I)}{2\left((\nabla^2 I_u)^2 + (\nabla^2 I_v)^2\right)}\,(\nabla^2 I_u, \nabla^2 I_v) \qquad (1)$$

For a straight edge, where all the directional derivatives in the $u$-direction are 0, it simplifies to

$$(\partial_t u, \partial_t v) = -\frac{1}{2}\,\frac{I_{vvvv}}{I_{vvv}}\,(0, 1) \qquad (2)$$
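The step from Eq. (1) to Eq. (2) can be checked directly. The short derivation below assumes, as a reading of "straight edge", that the intensity varies only along $v$ near the edge, so that every derivative with respect to $u$ vanishes:

```latex
\begin{align*}
\nabla^2 I &= I_{uu} + I_{vv} = I_{vv}, \\
\nabla^2 I_u &= 0, \qquad
\nabla^2 I_v = I_{vvv}, \qquad
\nabla^2(\nabla^2 I) = I_{vvvv}, \\
(\partial_t u, \partial_t v)
  &= -\frac{I_{vvvv}}{2\,\bigl(0^2 + I_{vvv}^2\bigr)}\,(0,\; I_{vvv})
   = -\frac{1}{2}\,\frac{I_{vvvv}}{I_{vvv}}\,(0,\; 1).
\end{align*}
```

The displacement is therefore purely along the gradient direction $v$, and its sign depends on the ratio of the fourth to the third derivative of the intensity profile across the edge.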

A similar formula is derived in (Lindeberg, 1994) for edges defined as extrema of first-order derivatives. The edge displacement represents the tendency of the movement of edges in scale space. If the scale interval is small, the edge displacement in the smoothed image provides a sufficient approximation to the total displacement of the edge, and this is what we will show in later illustrations.

The scale space behavior of straight edges is illustrated in Fig. 1. There are three kinds of edges. Edges of type (a), between a dark and a bright region, do not change location under scale space smoothing. Edges of type (b), at the boundaries of a bright line, or bar, in a dark region (or, equivalently, a dark line in a bright region), drift apart, assuming the smoothing parameter is large enough that the whole bar affects the edges. Edges of type (c), at the boundaries of a line of medium brightness next to a bright and a dark region, move toward each other. These observations suffice to explain a number of illusions.

Fig. 1. A schematic description of the behavior of edge movement in scale space. The first row shows the intensity functions of the three different edge configurations, and the second row shows the profiles of the (smoothed) functions with the dots denoting the location of edges: (a) no movement, (b) drifting apart, and (c) getting closer.

Fig. 2. (a) Illusory pattern: "spring" (from Kitaoka, 2003). (b) Small part of the figure to which (c) edge detection, (d) Gaussian smoothing, and (e) smoothing and edge detection have been applied.
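The three edge configurations of Fig. 1 can be reproduced with a few lines of code. The sketch below is a 1D illustration under assumptions of our own choosing: ideal step profiles smoothed in closed form with the error function, a line of half-width 1, and edges located as local maxima of the absolute first derivative (in 1D these are among the zero crossings of the second derivative). The names `profile` and `edge_positions` are hypothetical.

```python
from math import erf, sqrt
import numpy as np

X = np.linspace(-6.0, 6.0, 1201)  # sample positions, spacing 0.01

def smoothed_step(pos, sigma):
    """Unit step at `pos` convolved with a Gaussian of std `sigma` (closed form)."""
    return np.array([0.5 * (1.0 + erf((x - pos) / (sigma * sqrt(2.0)))) for x in X])

def profile(kind, sigma, half_width=1.0):
    """Smoothed intensity profile for one of the three edge configurations."""
    left = smoothed_step(-half_width, sigma)
    right = smoothed_step(half_width, sigma)
    if kind == "step":       # type (a): dark | bright
        return smoothed_step(0.0, sigma)
    if kind == "bar":        # type (b): dark | bright bar | dark
        return left - right
    if kind == "mid_line":   # type (c): dark | medium line | bright
        return 0.5 * (left + right)
    raise ValueError(kind)

def edge_positions(kind, sigma):
    """Edge locations as local maxima of |dI/dx| in the smoothed profile."""
    mag = np.abs(np.gradient(profile(kind, sigma), X))
    i = np.arange(1, len(X) - 1)
    peak = (mag[i] > mag[i - 1]) & (mag[i] >= mag[i + 1]) & (mag[i] > 1e-6)
    return X[i[peak]]

if __name__ == "__main__":
    for kind, sigma in [("step", 1.0), ("bar", 1.0), ("mid_line", 0.8)]:
        print(kind, "sigma =", sigma, "edges at", edge_positions(kind, sigma))
```

Before smoothing, the line edges sit at $\pm 1$. After smoothing, the single step edge stays at 0, the two bar edges lie outside $\pm 1$, and the two edges of the medium-brightness line lie inside $\pm 1$, in agreement with cases (a), (b) and (c).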

Fig. 2a (Kitaoka, 2003) shows a black square grid on a white background with small black squares superimposed. It gives the impression that the straight grid lines are concave and convex curves. The effect can be explained using the above observation. The grid consists of lines (or bars), and the effect of smoothing on the bars is to make the two edges (of type (b)) drift apart. At the locations, however, where a square is aligned with the grid, there is only one edge (type (a)), and this edge stays in place. The net effect of smoothing is that the edges of grid lines are no longer straight, as the figure illustrates. Fig. 2b shows a small part of the figure magnified. The black squares in the center of the grid have all been removed for clarity, as they do not notably affect the illusory perception. Fig. 2c shows the result of edge detection on the raw image using the Laplacian of a Gaussian (LoG). Fig. 2d shows the smoothed image that results from filtering with a Gaussian with standard deviation 5/4 times the width of the bars, and Fig. 2e shows the result of edge detection on the smoothed image using a LoG. (This is clearly the same as performing edge detection on the raw image with a LoG of larger standard deviation.)

Fig. 3a shows an even more impressive pattern from (Kitaoka, 2003): a black and white checkerboard with little white squares superimposed in the corners of the black tiles close to the edges, which gives the impression of wavy lines. In this pattern, short bars are created next to the white squares: a white area (from a little square) next to a black bar (from a black checkerboard tile) next to a white area (from a white checkerboard tile). The edges of these bars (edges of type (b)) drift apart under smoothing. The other edges (of type (a)), between the black and white tiles of the checkerboard, stay in place. As a result, the edges near the locations of the white squares appear bumped outward toward the white checkerboard tiles. This is illustrated in Fig. 3b, which shows the combined effect of smoothing and edge detection for a part of the pattern. Fig. 3c and d zoom in on the edge movement.

Another illusory pattern in this category is the "café wall" illusion shown in Fig. 4a. It consists of a black and white checkerboard pattern with alternate rows shifted one half-cycle and with thin mortar lines, mid-way in luminance between the black and white squares, separating the rows. At the locations where a mortar line borders both a dark tile and a bright tile, the two edges move toward each other, and for thin lines it takes a relatively small amount of smoothing for the two edges to merge into one. Where the mortar line is between two bright regions or between two dark regions, the edges move away from each other. The results of smoothing and edge detection are shown in Fig. 4c for the small part of the pattern shown in Fig. 4b. The movement of edges under scale space smoothing is illustrated in Fig. 4d.

Fig. 3. (a) Illusory pattern: "waves" (from Kitaoka, 2003). (b) The result of smoothing and edge detection on a part of the pattern. (c, d) The drift velocity at edges in the smoothed image, logarithmically scaled, for parts of the pattern.

Fig. 4. (a) Café wall illusion. (b) Small part of the figure. (c) Result of smoothing and edge detection. (d) Zoom-in on the edge movement.

We can counteract the effect of bias by introducing additional elements as shown in Fig. 5a; the additional white and black squares put in the corners of the tiles greatly reduce the illusory effect. As illustrated in Fig. 5c, the inserted squares partly compensate for the drifting of edges in opposite directions. As a result, slightly wavy edgels are obtained, but the "waviness" is too weak to be perceived (low amplitude, high frequency).

A full account of the perception of lines in the above illusions requires additional explanation. The lines are derived in two (or more) processing stages. In the first stage local edge elements are computed, which are tilted because of bias. The second stage consists of the integration of these local elements into longer lines. Our hypothesis is that this integration is computationally an approximation of the longer lines, using as input the positions and orientations of the edge elements. If the linking of edge elements is carried out this way, tilted lines will be computed in the café wall pattern and curved lines will be derived in the pattern "waves." Line fitting could possibly be realized as smoothing in orientation space (Morgan & Hoptopf, 1989) implemented in a multi-resolution architecture. At every resolution the average of the directions of neighboring elements is computed, and all the computations are local. In the case of general curves, increasingly larger segments of increasingly higher complexity could be fitted to smaller segments; computationally this amounts to a form of spline fitting.

Fig. 5. Modified café wall pattern. The additional black and white squares change the edges in the filtered image, which counteracts the illusory effect.

An integration process of this form also explains one of the most forceful of all illusions, the Fraser spiral pattern, which consists of circles made of black and white elements that together form something rather like a twisted cord, on a checkerboard background. The twisted cord gives the perception of being spiral-shaped rather than a set of circles. The individual black and white elements that make up the cord are sections of spirals; thus the edges at the borders of the black and white lines also lie along these directions, and the approximation process will fit spirals to them.

The concept of blurring has been invoked before by several authors as an explanation of some geometrical optical illusions (Chiang, 1968; Ginsburg, 1984; Glass, 1970). In particular, the café wall illusion has been explained by means of band-pass filtering in the visual system (Earle & Maskell, 1993; Morgan & Moulden, 1986). Fraser (1908) already related the effect in the Münsterberg illusion to his own twisted cord phenomenon. Morgan and Moulden (1986) showed that a pattern like the twisted cords is revealed in the mortar lines if the café wall figure is processed with a band-pass spatial frequency filter (smoothed Laplacians and difference of Gaussians). The cords consist of the peaks and troughs (maxima and minima) in the filtered image. We referred to the zero crossings. But essentially our explanation for the cause of the tilt of the edgels in the café wall illusion is no different from the one in (Morgan & Moulden, 1986). This is because the effect of noise in gray values on edge detection is computationally like band-pass filtering.

Within our framework the interpretation of band-pass filtering is different. We do not say that edge detection is carried out by band-pass filtering, although this may be the case. We say that the expected image (that is, the image estimated from noisy input) is like a smoothed image, and edges in smoothed images are biased. In other words, their location does not correspond to the location in the perfect image. It does not matter what the source of the noise is, and it does not matter how edges are computed, with Laplacians or as maxima of first-order derivatives.

The perceptual effect at intersecting lines is illustrated in Fig. 6. It can be shown with the model introduced in this section that the intersection point of two lines which intersect at an acute angle is displaced. The effect is obtained by smoothing the image and then detecting edges using non-maximum suppression (see Fig. 7). A more detailed analysis of the behavior of intersecting lines is the topic of the next section.

Fig. 6. The fine line as shown in A appears to be bent in the vicinity of the broader black line, as indicated in exaggeration in B (from Helmholtz, 1962, Chap. 28).

Fig. 7. (a) A line intersecting a bar at an angle of 15°. (b) The image has been smoothed and the maxima of the gray-level function have been detected and marked with stars. (c) Magnification of intersection area.

3. Bias in intersection points

There is a large group of illusions in which lines intersecting at angles, particularly acute angles, are a decisive factor in the illusion. Wundt (1898) drew attention to this: acute angles are overestimated, and obtuse angles are slightly underestimated (although regarding the latter there has been controversy). We predict that these phenomena are due to the bias in the estimation of the intersection point.

In this section we adopt a slightly different noise model, with the noise defined directly on the edge elements. Noise in gray-level values results in noise in the estimated edge elements, but the differentiation process also creates noise. The problem of finding the intersection points can then be formulated as solving a system of linear equations. This allows for a clean analysis of the influences of the different parameters on the solutions, and thus provides a powerful predictive model.

Consider the input to be edge elements, parameterized

by the image gradient (a vector in the direction

normal to the edge) $(I_x, I_y)$ and the position of the center

of the edge element $(x_0, y_0)$. The edge elements are noisy (see Fig. 8a). There is noise in the position (which as will

be shown, however, does not contribute to the bias) and

there is noise in the orientation. To obtain the inter-

section of straight lines, imagine a line through every

edge element, and compute the point closest to all the

lines (Fig. 8b).

In algebraic terms: consider additive, independently

and identically distributed (i.i.d.) zero-mean noise in the parameters of the edgels. In the sequel unprimed letters

are used to denote estimates, primed letters to denote

actual values, and $\delta$'s to denote errors, where $I_x = I'_x + \delta I_x$, $I_y = I'_y + \delta I_y$, $x_0 = x'_0 + \delta x_0$ and $y_0 = y'_0 + \delta y_0$.

For every point $(x, y)$ on the lines the following

equation holds:

$$I'_x x + I'_y y = I'_x x'_0 + I'_y y'_0 \tag{3}$$


Fig. 8. (a) The inputs are edge elements parameterized by the position of their centers $(x_{0_i}, y_{0_i})$ and the image gradient $(I_{x_i}, I_{y_i})$. (b) The intersection of straight lines is estimated as the point closest to all the "imaginary" lines passing through the edge elements.


This equation is approximated by the measurements. Let $n$ be the number of edge elements. Each edgel measurement $i$ defines a line given by the equation

$$I_{x_i} x + I_{y_i} y = I_{x_i} x_{0_i} + I_{y_i} y_{0_i} \tag{4}$$

and we obtain a system of equations which is represented in matrix form as

$$I_s \vec{x} = \vec{c}$$

Here $I_s$ is the $n \times 2$ matrix which incorporates the data in the $I_{x_i}$ and $I_{y_i}$, and $\vec{c}$ is the $n$-dimensional vector with components $I_{x_i} x_{0_i} + I_{y_i} y_{0_i}$. The vector $\vec{x}$ denotes the intersection point whose components are $x$ and $y$. The solution to the intersection point using standard least squares (LS) estimation is given by

$$\vec{x} = (I_s^T I_s)^{-1} I_s^T \vec{c} \tag{5}$$

where superscript $T$ denotes the transpose of a matrix. It is well known that the LS solution to a linear system of the form $A\vec{x} = \vec{b}$ with errors in the measurement matrix $A$ is biased (Fuller, 1987). The statistics for the case of i.i.d. noise in the parameters of $A$ and $\vec{b}$ can be found in standard textbooks. Our case is slightly different, as $\vec{b}$ is the product of terms in $A$ and two other noisy terms.
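The LS estimate of Eq. (5) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function names, the edgel layout, and the choice of Gaussian noise (the text only requires zero-mean i.i.d. noise) are assumptions. Noise is added to the gradients only, since the text states that positional noise does not contribute to the bias.

```python
import numpy as np

def ls_intersection(grads, centers):
    """Least-squares intersection of the lines through all edgels (Eq. 5).

    grads   : (n, 2) array of image gradients (I_x, I_y), normal to each edge
    centers : (n, 2) array of edgel centers (x_0, y_0)
    """
    grads = np.asarray(grads, dtype=float)
    centers = np.asarray(centers, dtype=float)
    c = np.sum(grads * centers, axis=1)            # c_i = I_xi * x_0i + I_yi * y_0i
    x, *_ = np.linalg.lstsq(grads, c, rcond=None)  # solves I_s x = c in the LS sense
    return x

def mean_noisy_estimate(grads, centers, sigma, trials=2000, seed=0):
    """Average LS estimate when i.i.d. Gaussian noise (an assumption) perturbs the gradients."""
    rng = np.random.default_rng(seed)
    grads = np.asarray(grads, dtype=float)
    estimates = [
        ls_intersection(grads + rng.normal(0.0, sigma, grads.shape), centers)
        for _ in range(trials)
    ]
    return np.mean(estimates, axis=0)

# Hypothetical layout: three edgels on the line x = 2 and three on the line y = 3.
grads = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)
centers = np.array([[2.0, 3.5], [2.0, 4.0], [2.0, 4.5],
                    [2.5, 3.0], [3.0, 3.0], [3.5, 3.0]])

exact = ls_intersection(grads, centers)                       # noise-free estimate
noisy_mean = mean_noisy_estimate(grads, centers, sigma=0.1)   # Monte Carlo average
print(exact, noisy_mean)
```

Without noise the estimate coincides with the true intersection. Comparing `noisy_mean` with `exact` gives an empirical view of the bias for whatever configuration is supplied.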

To simplify the analysis, let the variance of the noise in the spatial derivatives in the x- and y-directions be the same, and denote it by $\sigma_s^2$. Assuming the expected values of higher- (than second) order terms to be negligible, the expected value of $\vec{x}$ is found by developing (5) into a second-order Taylor expansion at zero noise (as derived in Appendix A). It converges in probability to

$$\lim_{n\to\infty} E(\vec{x}) = \vec{x}' + \sigma_s^2 \left( \lim_{n\to\infty} \frac{1}{n} M' \right)^{-1} \left( \bar{\vec{x}}'_0 - \vec{x}' \right) \tag{6}$$

where

$$M' = I_s'^T I'_s = \begin{bmatrix} \sum_{i=1}^{n} I'^2_{x_i} & \sum_{i=1}^{n} I'_{x_i} I'_{y_i} \\ \sum_{i=1}^{n} I'_{x_i} I'_{y_i} & \sum_{i=1}^{n} I'^2_{y_i} \end{bmatrix},$$

$\vec{x}'$ is the actual intersection point, $\bar{\vec{x}}'_0 = \left( \frac{1}{n} \sum_{i=1}^{n} x'_{0_i},\ \frac{1}{n} \sum_{i=1}^{n} y'_{0_i} \right)^T$ is the mean of the $\vec{x}'_{0_i}$, and $n$ denotes the number of edge elements.
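As a rough numerical reading of Eq. (6), the limit can be replaced by the finite-sample matrix $M'/n$. The sketch below does this under that assumption; the function name `predicted_bias` and both edgel layouts are hypothetical and only serve to show how the bias term is evaluated.

```python
import numpy as np

def predicted_bias(true_grads, true_centers, x_true, sigma_s):
    """Bias term of Eq. (6) with the limit replaced by the finite-n matrix M'/n (an assumption)."""
    G = np.asarray(true_grads, dtype=float)
    X0 = np.asarray(true_centers, dtype=float)
    n = len(G)
    M = G.T @ G                                              # M' = I_s'^T I_s'
    offset = X0.mean(axis=0) - np.asarray(x_true, dtype=float)
    return sigma_s**2 * np.linalg.solve(M / n, offset)

# Balanced case: two edgels on the line x = 0 and two on the line y = 0.
grads_even = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
centers_even = np.array([[0.0, 1.0], [0.0, 3.0], [1.0, 0.0], [3.0, 0.0]])
bias_even = predicted_bias(grads_even, centers_even, x_true=(0.0, 0.0), sigma_s=0.1)

# Unbalanced case: four edgels on x = 0, one on y = 0, same mean offset (1, 1).
grads_uneven = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]])
centers_uneven = np.array([[0.0, 0.5], [0.0, 1.0], [0.0, 1.5], [0.0, 2.0], [5.0, 0.0]])
bias_uneven = predicted_bias(grads_uneven, centers_uneven, x_true=(0.0, 0.0), sigma_s=0.1)

print(bias_even, bias_uneven)
```

With the same mean offset, the unbalanced layout yields a larger bias component perpendicular to the line carrying fewer edgels, because $M'^{-1}$ amplifies the direction with fewer gradient measurements.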

Using (6) allows for an interpretation of the bias. The estimated intersection point is shifted by a term which is proportional to the product of matrix $M'^{-1}$ and the difference vector $(\bar{\vec{x}}'_0 - \vec{x}')$. Vector $(\bar{\vec{x}}'_0 - \vec{x}')$ extends from the actual intersection point to the mean position of the edge elements. Thus it is the mass center of the edgels that determines this vector. $M'$ depends only on the spatial gradient distribution. As a real symmetric matrix its two eigenvectors are orthogonal, and the direction of the eigenvector of the larger eigenvalue is dominated by the major direction of gradient measurements. $M'^{-1}$ has the same eigenvectors as $M'$ and inverse eigenvalues, and therefore the influence of $M'^{-1}$ is strongest in the direction of the eigenvector of the smallest eigenvalue of $M'$. Consider the case of two intersecting lines, and thus two gradient directions; the effect of $M'^{-1}$ is more bias in the direction of fewer image gradients and less bias in the direction of more gradients. This means more displacement of the intersection point in the direction perpendicular to the line with fewer edge elements.

Fig. 9a shows the most common version of the Poggendorff illusion (as described by Zöllner, 1860). The upper-left portion of the interrupted, tilted straight line in this figure is apparently not the continuation of the lower portion on the right, but is too high. Another version of this illusion is shown in Fig. 9b. Here it appears that the middle portion of the inclined (interrupted) line is not in the same direction as the two outer patterns, but is turned clockwise with respect to them.

Referring to Fig. 9a, the intersection point of the left

vertical with the upper tilted line is moved up and to the

left, and the intersection point of the right vertical with

the lower tilted line is moved down and to the right. This

should be a contributing factor in the illusion. However, there are most likely other causes of this illusion, perhaps biases in higher-level processes which analyze larger regions of the image to compare the different line segments.

Fig. 9. Poggendorff illusion.

From parametric studies it is known that the illusory effect decreases with an increase in the acute angle (Cameron & Steele, 1905; Wagner, 1969). Our model predicts this, as can be deduced from Fig. 10. For a tilted line intersecting a vertical line at its mid-point at an angle $\phi$, we plotted the value of the bias in the x- and y-directions as a function of the angle $\phi$. As can be seen, as the angle increases, the bias in both components decreases.

Fig. 10. (a) A tilted line intersects a vertical line at its mid-point at an angle $\phi$. The data used was as follows: the tilted line had length 1, the vertical had length 2, and the edge elements with noise in the spatial derivatives of $\sigma = 0.08$ were distributed at equal distances. (b) Bias perpendicular to the vertical (bias in x as a function of $\phi$) and (c) bias parallel to the vertical (bias in y as a function of $\phi$).

Fig. 11 shows two versions of the well-known Zöllner illusion (Zöllner, 1860). The vertical bands in Fig. 11a and the diagonal lines in Fig. 11b (Hering, 1861) are all parallel, but they look convergent or divergent. Our theory predicts that in these patterns the biases in the intersection points of the long lines (or edges of bands) with the short line segments cause the edges along the long lines between intersection points to be tilted. The estimation is illustrated in Fig. 12 for a pattern as in Fig. 11a with 45° between the vertical and the tilted bars. In a second computational step, long lines are computed as an approximation to these small edge pieces. If, as discussed in the previous section, the positions and orientations of the line elements are used in the approximation, tilted lines or bars will be computed which are in the same direction as perceived by the visual system.

In experiments with this illusion it has also been found that the effect decreases as the acute angle between the main line and the obliques increases, which can be explained as before using Fig. 10b and c (or, similarly, Fig. 15). The value where the maximum occurs varies among different studies. It is somewhere between 10° and 30°; below that, some counteracting effects seem to take place (Morinaga, 1933; Wallace & Crampin, 1969).

Other parametric studies have been conducted on the effect of altering the orientations of the Poggendorff and Zöllner figures. The Poggendorff illusion was found to be strongest with the parallel lines vertical or horizontal (Green & Hoyle, 1964; Leibowitz & Toffey, 1966). The Zöllner illusion, on the other hand, was found to be maximal when the judged lines were at 45° (Judd & Courten, 1905; Morinaga, 1933), as in Fig. 11b.

The term "spatial norms" is used to refer to the vertical and horizontal directions: we generally see better in these orientations (Howard & Templeton, 1966). There also is evidence from brain imaging techniques for more activity in early visual areas (V1) for horizontal and vertical than for oblique orientations (Furmanski & Engel, 2000). Based on these findings we can assume there is more data for horizontal and vertical lines, that is, more edge elements are estimated in these and nearby directions. This in turn amounts to higher accuracy in the estimation of quantities in this direction. Intuitively, in a direction where there are more estimates there is a larger signal to noise ratio, and this results in greater accuracy in this direction.

Fig. 11. (a) Zöllner pattern. (b) Hering's version of the Zöllner pattern gives increased illusory effect.

Fig. 12. The estimation of edges in the Zöllner pattern: the edge elements were found by connecting two consecutive intersection points, resulting from the intersection of edges of two consecutive tilted bars with the edge of the vertical bar (one in an obtuse and one in an acute angle). As input we used edge elements uniformly distributed on the vertical and on the tilted lines with 1.5 times more elements on the vertical.

Expressed in our formalism of Eq. (6), this takes the following form. The effect of the gradient distribution on the bias is strongest in the direction of the eigenvector corresponding to the smaller eigenvalue of $M'$ and weakest in the orthogonal direction. Changing the ratio of measurements (edgels) along the different lines changes the bias. Fig. 13 illustrates, for the case of a vertical line intersecting an oblique line at an angle of 30°, the bias in the x- and y-directions as a function of the ratio of edge elements. It can be seen that as the ratio of vertical to oblique elements increases, the bias in the x-direction decreases and the bias in the y-direction increases. The Poggendorff illusion is stronger when the parallel lines are vertical or horizontal, because in this case the bias parallel to the lines (along the y-axis in the plot) is larger, and the Zöllner illusion is stronger when the small lines are horizontal and vertical and the main lines are tilted, as in this case the bias perpendicular to the main lines (along the x-axis in the plot) is larger.

Fig. 13. (a) A vertical line intersecting a tilted line at an angle $\phi = \pi/6$ (the length of the vertical and the tilted line is one unit and $\sigma = 0.06$). (b) Bias perpendicular to the vertical (bias in x) as a function of the ratio of edge elements on the vertical and tilted lines. (c) Bias parallel to the vertical (bias in y).

Many other well-known illusions can be explained on the basis of biased line intersection. Examples are the Orbison figures (Orbison, 1939), Wundt's figure (Wundt, 1898), and the patterns of Hering (1861) and Luckiesh (1922). In these patterns geometrical entities such as straight lines, circles, triangles or squares are superimposed on differently tilted lines, and they appear to be distorted. This distortion can be accounted for by the erroneous estimation of the tilt in the line elements between intersection points and the subsequent fitting of curves to these line elements.

Fig. 14 illustrates the estimation of the curve in the Luckiesh pattern. Each line has two edges, and we computed the intersection between any background edge and circle edge (using configurations such as those in Fig. 13a). This provided, for every intersection of the circle with a straight line, four intersection points, two corresponding to the inner edges of the circle and two to the outer ones. Arcs on the circle between two consecutive background lines were approximated by straight lines. (The ratio of elements between the circle and the background edge was 2:1, resulting in a configuration similar to that of Fig. 15 with the arc corresponding to the vertical. Note that other ratios give qualitatively similar results.) Consecutive intersection points, one originating from an obtuse and one from an acute angle, were connected with straight line segments. Bezier splines were then fitted to the outer line segments. This resulted in a curve like the one we perceive, with the circle bulging out on the upper and lower left and bulging in on the upper and lower right.

Fig. 14. Estimation of Luckiesh pattern: (a) the pattern, a circle superimposed on a background of differently arranged parallel lines, (b) fitting of arcs to the circle, (c) magnified upper-left part of pattern with fitted arcs superimposed, and (d) intersection points and fitting of segments to outer intersections.

Next, let us look at the erroneous estimation of angles. Assuming that the erroneously estimated intersection point has a distorting effect on the arms (Wallace, 1969), the bias discussed for the above illusions will result in the overestimation of acute angles. The underestimation of obtuse angles can be explained if we assume an unequal amount of edgel data on the two arcs. Fig. 15 illustrates the bias for acute and obtuse angles for the case of more edge elements on the vertical than on the tilted line. The bias in the x-direction changes sign and the bias in the y-direction increases (with increasing angle) for obtuse angles, and this results in a small underestimation of obtuse angles.

We used the intersection point as main criterion to explain the illusions discussed in this section. But very likely many of these illusions are due to the estimation of multiple features. In particular, for illusions involving many small line segments, such as the Zöllner pattern, estimation of edges and estimation of intersection points would have very similar results.

Illusions of intersecting lines have been intensely studied, and there are also models that employ in some form the concept of noise. Morgan and Casco (1990) propose as explanation of the Zöllner and Judd illusion band-pass filtering followed by feature extraction in second stage filters. The features are the extrema in the band-pass filtered image, which correspond to the intersection points. Morgan (1999) studied the Poggendorff illusion and suggests smoothing in second stage filters as the main cause.

Fig. 15. A vertical line and a tilted line of length 1 intersecting at an angle $\phi$; the ratio of vertical to tilted edge elements is 3:1; $\sigma = 0.06$. (a) Bias perpendicular to the vertical (bias in x). (b) Bias parallel to the vertical (bias in y).

4. Bias in motion

When processing image sequences some representation

of image motion must be derived as a first stage. It is believed that the human visual system computes two-dimensional

image measurements which correspond to

velocity measurements of image patterns, called optical

flow. The resulting field of measurements, the optical

flow field, represents an approximation to the projection

of the field of motion vectors of 3D scene points on the

image.

Optical flow is derived in a two-stage process. In a first stage the velocity components perpendicular to

linear features are computed from local image mea-

surements. This one-dimensional velocity component is

referred to as ‘‘normal flow’’ and the ambiguity in the

velocity component parallel to the edge is referred to as

the ‘‘aperture problem.’’ In a second stage the optical

flow is estimated by combining, in a small region of the

image, normal flow measurements from features in different directions, but this estimate is biased.

We consider a gradient-based approach to deriving

the normal flow. The basic assumption is that image

gray level does not change over a small time interval.

Denoting the spatial derivatives of the image gray level $I(x, y, t)$ by $I_x, I_y$, the temporal derivative by $I_t$, and the velocity of an image point in the x- and y-directions by $\vec{u} = (u, v)$, the following constraint is obtained:

$$I_x u + I_y v + I_t = 0 \tag{7}$$

This equation, called the optical flow constraint equation, defines the component of the flow in the direction of the gradient (Horn, 1986). We assume the optical flow to be constant within a region. Each of the $n$ measurements in the region provides an equation of the form (7) and thus we obtain the over-determined system of equations

$$I_s \vec{u} + \vec{I}_t = 0 \tag{8}$$

where $I_s$ denotes, as before, the matrix of spatial gradients $(I_{x_i}, I_{y_i})$, $\vec{I}_t$ the vector of temporal derivatives, and $\vec{u} = (u, v)$ the optical flow. The least squares solution to (8) is given by

$$\vec{u} = -(I_s^T I_s)^{-1} I_s^T \vec{I}_t \tag{9}$$

As a noise model we consider zero-mean i.i.d. noise in the spatial and temporal derivatives, and for simplicity, equal variance $\sigma_s^2$ for the noise in the spatial derivatives.

The statistics of (9) are well understood, as these are the classical linear equations. The expected value of the flow, using a second-order Taylor expansion, is derived in Appendix A; it converges to

$$\lim_{n\to\infty} E(\vec{u}) = \vec{u}' - \sigma_s^2 \left( \lim_{n\to\infty} \frac{1}{n} M' \right)^{-1} \vec{u}' \tag{10}$$

where, as before, the actual values are denoted by primes.

Eq. (10) is very similar to Eq. (6) and the interpretation given there applies here as well. It shows that the bias depends on the gradient distribution (that is, the texture) in the region. Large biases are due to large variance, ill-conditioned $M'$, or a $\vec{u}'$ which is close to the eigenvector of the smallest eigenvalue of $M'$. The estimated flow is always underestimated in length, and it is closer in direction to the direction of the majority of normal flow vectors than the veridical.
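Eqs. (9) and (10) can be sketched numerically as follows. The function names, the gradient layout and the noise level are assumptions made for illustration, and the limit in Eq. (10) is replaced by the finite-sample matrix $M'/n$.

```python
import numpy as np

def ls_flow(grads, It):
    """LS optical flow of Eq. (9): u = -(I_s^T I_s)^{-1} I_s^T I_t."""
    Is = np.asarray(grads, dtype=float)
    It = np.asarray(It, dtype=float)
    u, *_ = np.linalg.lstsq(Is, -It, rcond=None)
    return u

def expected_flow(true_grads, u_true, sigma_s):
    """Eq. (10) with the limit replaced by the finite-n matrix M'/n (an assumption)."""
    Is = np.asarray(true_grads, dtype=float)
    u_true = np.asarray(u_true, dtype=float)
    M = Is.T @ Is
    return u_true - sigma_s**2 * np.linalg.solve(M / len(Is), u_true)

# Hypothetical region: four gradients along x, one along y; true flow (1, 1).
grads = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]])
u_true = np.array([1.0, 1.0])
It = -grads @ u_true                  # noise-free temporal derivatives from Eq. (7)

u_ls = ls_flow(grads, It)             # recovers u_true when there is no noise
u_exp = expected_flow(grads, u_true, sigma_s=0.1)
print(u_ls, u_exp)
```

In this example the expected flow is shorter than the true flow and rotated toward the x-axis, the direction of the majority of the gradients, which is the qualitative behavior described above.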

Fig. 16 shows a variant of a pattern created by Ouchi (1977). The pattern consists of two rectangular checkerboard patterns oriented in orthogonal directions: a background orientation surrounding an inner ring. Small retinal motions, or slight movements of the paper, cause a segmentation of the inset pattern, and motion of the inset relative to the surround. This illusion has been discussed in detail in (Fermüller, Pless, & Aloimonos, 2000). We will thus only give a short description here.

Fig. 16. A pattern similar to the one by Ouchi.

The tiles used to make up the pattern are longer than they are wide, leading to a gradient distribution in a small region with many more normal flow measurements in one direction than the other. Since the tiles in the two regions of the figure have different orientations, the estimated regional optical flow vectors are different. The difference between the bias in the inset and the bias in the surrounding region is interpreted as motion of the ring. An illustration is given in Fig. 17 for the case of motion along the first meridian (to the right and up). In addition to computing flow, the visual system also performs segmentation, which is why a clear relative motion of the inset is seen.

Fig. 18. If fixating on the center and moving the paper along the line of sight, the inner circle appears to rotate.

Fig. 17. (a) The optical flow field. (b) The error vector field, that is, the difference between the estimated and the veridical motion. The line from the center is the direction of the veridical motion.

Another impressive illusory pattern from Pinna and Brelstaff (2000) is shown in Fig. 18. If fixating on the center and moving the page (or the head) along the optical axis back and forth, the inner circle appears to rotate, clockwise with a motion of the paper away from the eyes. For a backward motion of the paper the motion vectors are along lines through the image center, pointing away from the center. The normal flow vectors are perpendicular to the edges of the parallelograms. Thus the estimated flow vectors are biased in the clockwise direction in the outer ring and in the counterclockwise direction in the inner ring, as shown in Fig. 19. The difference between the inner and outer vectors (along a line through the center) is tangential to the circles, and this explains the perceived rotational movement. In a similar way one can explain the perception of a spiral movement when rotating the pattern around an axis through the center and perpendicular to the paper. The illusory effect is very much decreased by slightly changing the pattern as in Fig. 20. In this figure the inserted diagonal lines make the local gradient distribution in the inner and outer parallelograms about the same, and thus there is no significant difference in bias which could cause the perception of motion.

To be more rigorous, the estimation is somewhat more elaborate. The vision system not only computes normal flow and flow on the basis of the raw image. It also smoothes, or blurs, the image and computes normal flow and flow from the blurred image. That is, the vision system utilizes flow at different levels of resolution. For the pattern here the flow vectors at different resolutions are in the same orientation, thus there will be no difference whether the system used multiple resolutions or not. However, one may modify the pattern by changing the black and white in the edges, thus creating bias in one direction at high resolution and bias in the other direction at low resolution, and this way reduce the illusion (as in Fig. 6 in Pinna & Brelstaff, 2000).

Fig. 19. Flow field computed from Fig. 18.

Fig. 20. Modification of Fig. 18 reduces the illusory motion.

Helmholtz (1962) describes an experiment with the Zöllner pattern which causes illusory motion. When the point of a needle is made to traverse Zöllner's pattern (Fig. 11a) horizontally from right to left, its motion being followed by the eye, a perception of motion in the bands occurs. The first, third, fifth and seventh black bands ascend, while the second, fourth, and sixth descend; it is just the opposite when the direction of the motion is reversed.

Fig. 21. The eye motion gives rise to flow $u'$ and normal flow vectors $n_1$ and $n_2$. The estimated flow, $u$, has a positive y-component, which causes illusory motion upward in the band.

The bias predicts this effect as follows: a motion of the eyes from right to left gives rise to optical flow from left to right. For each band there are two different gradient directions, i.e., there are two different normal flow components in each neighborhood. For the odd bands the two normal flow components are in the direction of the flow and diagonally to the right and up. Thus the estimated flow makes a positive angle with the actual flow (that is, it has a positive y-component), and this component along the y-axis is perceived as upward motion of the bands (see Fig. 21). For even bands the estimated flow is biased downward, causing the perception of descent of the bands. Similarly, if the motion of the eye is reversed the estimated flow has a negative y-component in the odd bands and a positive y-component in the even bands, leading to a reversal in the perceived motions of the bands.

We need to clarify here two issues. First, the simple model of constant flow in a neighborhood in general is not adequate to model image motion. We have used this model here for the purpose of a simple analysis. This model is sufficient only for fronto-parallel scenes and translational motions parallel to the image plane. In the general case we have to segment the flow field at the discontinuities and within coherent regions allow for spatially smooth variation of the flow (Hildreth, 1983; Horn & Schunck, 1981). Smoothness may be modeled by describing the flow as a smooth function of position, or by minimizing, in addition to the deviation from the flow constraint equation, a function of the spatial derivatives of the flow. But the principle of bias is still there (Fermüller, Shulman, & Aloimonos, 2001).

Second, there are three models for computing optical flow in the literature. Besides gradient-based models there are frequency domain and correlation models, but computationally they are not very different. In all the models there is a stage in which smoothness assumptions are made and measurements within a region are combined. At this stage noisy estimates lead to bias (Fermüller et al., 2001).

There is a large number of motion experiments that also can be explained by our model.

A significant body of work on the integration of local velocity signals has been conducted in the context of moving plaids. Plaids are combinations of two wave gratings of different orientation, each moving with a constant speed. For any such "moving plaid" there is always some planar velocity the whole pattern can undergo which would produce exactly the same retinal stimulus. Depending on the parameters of the component gratings, the perception is of one coherent motion of the plaid or of two different motions of the gratings.

The motion of a coherent pattern can be found theoretically with the intersection of constraints (IOC) model: the vector component obtained from each individual grating constrains the local velocity vector to lie upon a line in velocity space, and the intersection of the lines defines the motion of the plaid (Adelson & Movshon, 1982). However, often the perceived motion is different from the veridical. The error is influenced by factors including orientation of the gratings, frequency, and contrast.

Fig. 22. A spiral rotating in the clockwise direction gives rise to a flow field with radially inward pointing vectors. The flow was derived using LS estimation within small circular regions. (No smoothness constraints were enforced.)

Ferrera and Wilson (1987) made a distinction between two kinds of plaids; in type I plaids the common motion is between the component motions of the two gratings, and in type II plaids the common motion lies outside the component directions. In case of equal contrast and frequency, for type II plaids the velocity is perceived towards the average of the component vectors (the VA) (Burke & Wenderoth, 1993; Ferrera & Wilson, 1990, 1991), whereas for type I plaids the estimate is largely veridical. For type I plaids, if the contrast of the gratings is different the perceived velocity is closer in direction to the component of the grating of higher contrast (Kooi, Valois, Grosof, & Valois, 1992; Stoner, Albright, & Ramachandran, 1990). If the spatial frequency of the gratings differs (Smith & Edgar, 1991), the perceived motion is closer in direction to the gradients of higher spatial frequency than the IOC velocity. In no case is there an overestimate of the plaid velocity when compared to the IOC prediction.
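The IOC velocity, the vector average (VA), and the type I / type II distinction can be made concrete with a small sketch. The function names, the grating orientations, and the rule used to classify plaid type (the IOC velocity lies strictly inside the cone spanned by the two component velocity vectors) are assumptions for illustration, not definitions taken from the cited studies.

```python
import numpy as np

def unit(deg):
    """Unit normal of a grating whose motion direction is at `deg` degrees."""
    a = np.deg2rad(deg)
    return np.array([np.cos(a), np.sin(a)])

def ioc_velocity(normals, speeds):
    """Intersection of constraints: solve n_i . v = s_i for the two gratings."""
    return np.linalg.solve(np.asarray(normals, float), np.asarray(speeds, float))

def vector_average(normals, speeds):
    """Average of the component velocity vectors s_i * n_i."""
    N = np.asarray(normals, float)
    s = np.asarray(speeds, float)
    return (N * s[:, None]).mean(axis=0)

def is_type_one(normals, speeds):
    """True if the IOC velocity is a positive combination of the component vectors."""
    N = np.asarray(normals, float)
    s = np.asarray(speeds, float)
    components = (N * s[:, None]).T            # columns are the component vectors
    coeffs = np.linalg.solve(components, ioc_velocity(N, s))
    return bool(np.all(coeffs > 0))

v_true = np.array([1.0, 0.0])                  # hypothetical plaid velocity

# Component directions on either side of the plaid velocity (type I).
n_sym = np.array([unit(45), unit(-45)])
s_sym = n_sym @ v_true

# Both component directions on the same side of the plaid velocity (type II).
n_same = np.array([unit(20), unit(40)])
s_same = n_same @ v_true

print(ioc_velocity(n_sym, s_sym), vector_average(n_sym, s_sym), is_type_one(n_sym, s_sym))
print(ioc_velocity(n_same, s_same), vector_average(n_same, s_same), is_type_one(n_same, s_same))
```

In both configurations the IOC solution returns the plaid velocity, while the vector average is shorter and, in the type II case, deviates in direction toward the component vectors.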

Recalling that the plaid velocity is biased in direction toward the eigenvector of the larger eigenvalue of $M'$, we can predict the expected bias from changes in the plaid pattern. For components of equal contrast and frequency, the major eigenvector is in the direction of the vector average of the component motion vectors. In type II plaids this results in an estimated flow towards the VA direction. If the contrast of one grating increases, the major eigenvector moves towards the direction of motion of that grating, explaining the findings of Stoner et al. (1990) and Kooi et al. (1992). Higher frequencies in a direction amount to more measurements in that direction. In the case of orthogonal gratings, as in Smith and Edgar (1991), this results in an estimated flow closer in direction to the motion of the higher frequency.

Mingolla, Todd, and Norman (1992) studied stimuli consisting of lines moving behind apertures. The lines had one of two different orientations. For a motion to the right, and lines at orientations of 15° and 45° from the vertical, that is, when the normal flow components were in the first quadrant (upwards and right), the motion appeared upward biased. With the normal flow components in the second quadrant (downwards and right, the lines at −15° and −45° from the vertical) the motion appeared downward biased, and for symmetric lines (+15° and −15°) the motion was perceived as horizontal. As in the case of type II plaids, this can be predicted from the bias in flow estimation which changes the direction of the estimated flow towards the vector average direction.
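The three aperture configurations can be run through Eq. (10) as a quick check. The sketch assumes unit-length gradients, one measurement per line orientation, a sign convention in which a line at angle θ from the vertical has its normal at angle θ from the horizontal, and a finite-sample matrix $M'/n$ in place of the limit; the function name and noise level are hypothetical.

```python
import numpy as np

def expected_flow(line_angles_deg, u_true=(1.0, 0.0), sigma_s=0.1):
    """Eq. (10) with M'/n built from unit normals of lines tilted from the vertical."""
    a = np.deg2rad(np.asarray(line_angles_deg, dtype=float))
    Is = np.column_stack([np.cos(a), np.sin(a)])   # assumed unit gradients (normals)
    u_true = np.asarray(u_true, dtype=float)
    M = Is.T @ Is
    return u_true - sigma_s**2 * np.linalg.solve(M / len(Is), u_true)

up = expected_flow([15, 45])       # normals pointing up and to the right
down = expected_flow([-15, -45])   # normals pointing down and to the right
level = expected_flow([15, -15])   # symmetric lines

print(up, down, level)
```

For rightward motion the expected flow acquires an upward component in the first case, a downward component in the second, and stays horizontal for the symmetric lines, in line with the reported percepts.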

Circular figures rotating in the image plane may not

give the perception of a rigid rotation (Musatti, 1929;

Wallach, 1935). The effect is well known for the spiral

which appears to contract or expand depending whether

the rotation is clockwise or counterclockwise. For a

rotation around the spiral’s center the motion vectors

are tangential to circles and the normal flow vectors are

close to the radial direction. This situation creates large

bias in the regional estimation of flow and thus an

additional motion component in the radial direction as

illustrated in Fig. 22. This illusion, among many others,

has been explained by Hildreth (1983) by means of a

model of smooth flow. The smoothness constraint also contributes to radial motion as it penalizes positional

change in flow. As discussed above, a complete flow

model also needs to consider variation in the flow.

However, even with small amounts of noise, the con-

tribution of the bias to the radial motion component will

be larger than the contribution from smoothness.

Noise in the normal flow measurements has been

discussed before as a possible cause for misperceived motion. In Nakayama and Silverman (1988a, 1988b)

and Ferrera and Wilson (1991) Monte Carlo experi-

ments were performed to determine the expected value

and variance of velocity calculated with the IOC meth-

od. The experiments proceeded by creating one-dimen-

sional motion components which were corrupted with

error of Gaussian distribution, and then computing for

pairs of local velocity measurements the motion vector with the IOC model. The distribution of estimates created with this method was not found to be significantly

biased away from the IOC prediction (Ferrera & Wil-

son, 1991) for the case of plaids, but the variance of

these estimates was found to be correlated with the

accuracy of directional perception (Nakayama & Silv-

erman, 1988a).

Nakayama and Silverman (1988a, 1988b) found that smooth curves, including sinusoids, Gaussians and

sigmoids may be perceived to deform non-rigidly when

Fig. 23. (a, b) Regional LS estimation of flow. (c, d) Flow estimation using smoothness constraints.

1 The non-Bayesian method of LS estimation implicitly also

assumes priors; it assumes that all solutions have equal probability.

742 C. Fermüller, H. Malm / Vision Research 44 (2004) 727–749

translated in the image plane. For example, a sinusoid

with low curvature (due to low amplitude or low fre-

quency) moving with horizontal translation appears to

deform non-rigidly, while the same motion for a sinu-

soid of high curvature appears veridical. In this case the

main difference between the two curves lies in the

amount of biased regional flow estimates, as illustrated in Fig. 23. The flow fields in Fig. 23a and b were estimated using LS estimation within small regions. As can

be seen, in regions next to the inflexion point the flow

vectors are biased upward on one side and downward on

the other, and these regions are much smaller in the

curve with larger curvature. The flow fields in Fig. 23c

and d were estimated by enforcing in addition smoothness constraints. The smoothness propagates the bias, resulting in vertical flow components which are upward

in half of the low curvature sinusoid and downward in

the other half. In comparison the bias is much less in the

high curvature sinusoid, affecting only the areas at the

center of the curve.

Nakayama and Silverman (1988a) attributed the

phenomenon to the large variance in the distribution of

flow estimates. Clearly, the bias in regional flow estimates and the variance of the distribution of all the

estimates are correlated; the larger the bias, the larger

the variance. As Nakayama and Silverman (1988a)

point out, the variance should be a good measure for the

process of segmentation, that is to decide on coherence

or non-coherence.

Simoncelli, Adelson, and Heeger (1991) and Weiss

and Adelson (1998) also discuss noise in the normal flow

estimates, but they only consider noise in the temporal

derivatives. These noise terms do not cause bias; it is the

noise in the spatial derivatives (i.e., the orientation of the

local image velocity components) which causes bias.

Weiss and Adelson, however, derive a bias in flow estimation using Bayesian modeling. Their explanation

is based on the assumption that there is an a priori

preference for small flow values. It is easily understood

that this preference results in an increase in the a posteriori probability of small flow values and thus a bias towards underestimation of the flow, and thus a flow estimation similar to ours. Note, however, that in the Bayesian model the bias is in effect assumed, whereas in our model it is not (see footnote 1).

5. The theoretical question

The natural question to ask is whether the bias is due

to the particular computations we described, or whether

it is an inherent problem. In other words, is the esti-

mation of features biased in whatever way it is con-

ducted, or is bias a feature only of linear estimation? Are

there other methods which do not suffer from bias to

begin with, or is it maybe possible to estimate the bias

and then correct for it? Our answer is, in general it is

not. This section is a brief statistical discussion

explaining what could be done, why it is very hard to

correct for the bias and that the theoretically best thing

to do is to partially correct for it.

The mathematical problem at hand (for edge intersection and flow estimation) is to find a solution to an over-determined system of equations of the form $A\vec{x} = \vec{b}$. The observations $A$ and $\vec{b}$ are corrupted by noise, i.e., $A = A' + \delta A$ and $\vec{b} = \vec{b}' + \delta\vec{b}$. In addition there is system error, $\vec{\epsilon}$ (because the equations are only approximations), and thus $A'\vec{x} = \vec{b}' + \vec{\epsilon}$. We are dealing with what is called in statistical regression the errors-in-variables (EIV) model.

It is well known that the least squares (LS) estimator is biased for this model (Fuller, 1987). The LS solution, which is linear in $\vec{b}$, is an asymptotically unbiased estimator only when the errors $\delta A$ are 0 and the errors $\delta\vec{b}$ are independent.
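The following is a minimal Monte Carlo sketch of this statement, not the computation used in this paper. It assumes a hypothetical two-parameter problem (the names `x_true` and `A_true`, the noise levels and the sample sizes are illustrative choices) and compares the average LS solution when only $\vec{b}$ is noisy with the average when $A$ is noisy as well.

```python
import numpy as np

def mean_ls_estimate(sigma_A, sigma_b, n=50, trials=4000, seed=0):
    """Average the LS solution of A x = b over many noisy realizations."""
    rng = np.random.default_rng(seed)
    x_true = np.array([1.0, 0.5])        # hypothetical true parameter vector x'
    A_true = rng.normal(size=(n, 2))     # hypothetical noise-free matrix A'
    b_true = A_true @ x_true             # noise-free right-hand side b'
    total = np.zeros(2)
    for _ in range(trials):
        A = A_true + sigma_A * rng.normal(size=(n, 2))   # A = A' + dA
        b = b_true + sigma_b * rng.normal(size=n)        # b = b' + db
        total += np.linalg.lstsq(A, b, rcond=None)[0]
    return total / trials, x_true

mean_b_noise, x_true = mean_ls_estimate(sigma_A=0.0, sigma_b=0.3)
mean_eiv, _ = mean_ls_estimate(sigma_A=0.3, sigma_b=0.3)

print("true x:           ", x_true)
print("noise in b only:  ", mean_b_noise)
print("noise in A and b: ", mean_eiv)
```

Under these assumptions the average estimate with noise in $\vec{b}$ only stays close to the true vector, whereas noise in $A$ shrinks the average estimate towards zero, which is the underestimation discussed in the text.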

The classical way to address the problem is by means

of the corrected least squares (CLS) estimator. If the

statistics of the noise, that is the covariance matrix of

$\delta A$, were known, an asymptotically unbiased linear estimate could be obtained. The problem is that for

small amounts of data, accurate estimation of the vari-

ance of the noise is a very difficult problem. In this case,

the estimation of the variance is not accurate, leading to

large variance for CLS.
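As an illustration only, the sketch below applies the CLS formula given in Appendix B, $(A^TA - n\sigma^2 I)^{-1}A^T\vec{b}$, to a hypothetical synthetic problem in which the noise standard deviation `sigma` is known exactly. The problem size, seed and helper names `ls` and `cls` are assumptions made for the example.

```python
import numpy as np

def ls(A, b):
    """Ordinary least squares via the normal equations."""
    return np.linalg.solve(A.T @ A, A.T @ b)

def cls(A, b, sigma):
    """Corrected least squares: remove the expected noise term n*sigma^2*I from A^T A."""
    n, k = A.shape
    return np.linalg.solve(A.T @ A - n * sigma**2 * np.eye(k), A.T @ b)

rng = np.random.default_rng(1)
n, sigma, trials = 200, 0.3, 2000
x_true = np.array([1.0, 0.5])            # hypothetical true solution x'
A_true = rng.normal(size=(n, 2))         # hypothetical noise-free matrix A'
b_true = A_true @ x_true

ls_sum, cls_sum = np.zeros(2), np.zeros(2)
for _ in range(trials):
    A = A_true + sigma * rng.normal(size=(n, 2))
    b = b_true + sigma * rng.normal(size=n)
    ls_sum += ls(A, b)
    cls_sum += cls(A, b, sigma)

mean_ls, mean_cls = ls_sum / trials, cls_sum / trials
print("mean LS: ", mean_ls)
print("mean CLS:", mean_cls)
```

The correction works here only because `sigma` is passed in as a known quantity; the difficulty described in the text is precisely that this value has to be estimated from little data.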

In recent years the technique of total least squares

(TLS) has received a lot of attention. The basic idea

underlying this non-linear technique is to deal with the errors in $A$ and $\vec{b}$ symmetrically. If all the errors in $\delta A$ and $\delta\vec{b}$ are independent and identically distributed, TLS is asymptotically unbiased. In the case they are not, one would need to whiten the data. But this requires the estimation of the ratio of the error variances of $\delta A$ and $\delta\vec{b}$, which is at least as hard as obtaining the variance of $\delta A$. An incorrect value of the ratio often results in an unacceptably large overcorrection for the bias. However, the main problem for TLS is system error. Theoretically one

can use multiple tests to obtain the measurement errors,

like re-measuring or re-sampling; but unless the exact

parameters of the model are known, one cannot test for

system error.
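A common way to compute the TLS solution is through the singular value decomposition of the augmented matrix $[A \mid \vec{b}]$. The sketch below uses that standard construction on a hypothetical synthetic problem in which the errors in $A$ and $\vec{b}$ have the same variance and there is no system error, that is, the favorable case described above. All numerical values are illustrative assumptions.

```python
import numpy as np

def tls(A, b):
    """Total least squares via the SVD of the augmented matrix [A | b]."""
    k = A.shape[1]
    _, _, Vt = np.linalg.svd(np.column_stack([A, b]), full_matrices=False)
    v = Vt[-1]                 # right singular vector of the smallest singular value
    return -v[:k] / v[k]

def ls(A, b):
    """Ordinary least squares via the normal equations."""
    return np.linalg.solve(A.T @ A, A.T @ b)

rng = np.random.default_rng(4)
n, sigma, trials = 200, 0.3, 2000
x_true = np.array([1.0, 0.5])            # hypothetical true solution x'
A_true = rng.normal(size=(n, 2))         # hypothetical noise-free matrix A'
b_true = A_true @ x_true

ls_sum, tls_sum = np.zeros(2), np.zeros(2)
for _ in range(trials):
    A = A_true + sigma * rng.normal(size=(n, 2))   # same noise level in A ...
    b = b_true + sigma * rng.normal(size=n)        # ... and in b
    ls_sum += ls(A, b)
    tls_sum += tls(A, b)

mean_ls, mean_tls = ls_sum / trials, tls_sum / trials
print("mean LS: ", mean_ls)
print("mean TLS:", mean_tls)
```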

Re-sampling techniques, such as the bootstrap and the jackknife, have been discussed for bias correction. These
techniques can correct for the error term which is inversely proportional to the number of data points (i.e., $O(1/n)$), and thus they can improve the estimate of the mean for

unbiased estimators. However, these techniques cannot

correct for the bias (Efron & Tibshirani, 1993). They are

useful for estimating the variance in order to provide

confidence intervals.

Why is it so difficult to obtain accurate estimates of

the noise parameters? To acquire a good noise statistic a

lot of data is required, so data needs to be taken from

large spatial areas acquired over a period of time, but

the models used for the estimation can only be assumed

to hold locally. Thus to integrate more data, models of

the scene need to be acquired. Specifically, in the case of

intersecting lines, first long edges and bars need to be

detected, and in the case of motion, first discontinuities

due to changes in depth and differently moving entities need to be detected and the scene segmented. If the noise

parameters stayed fixed for extended periods of time it

would be possible to acquire enough data to closely

approximate these parameters, but usually the noise

parameters do not stay fixed long enough. Sensor

characteristics may stay fixed, but there are many other

sources of noise besides sensor noise. The lighting conditions, the physical properties of the objects being viewed, the orientation of the viewer in 3D space, and

the sequence of eye movements all have influences on the

noise.

Clearly, bias is not the only thing that matters. There

is a trade-off between bias and variance. Generally, an

estimator correcting for bias increases the variance while

decreasing the bias. Very often, the mean squared error

(MSE) is used as a criterion for the performance of an estimator. It is the expected value of the square of the difference between the estimated and the true value. If $x'$ is used to denote the actual value, $\hat{x}$ to denote the estimate, and $E(\cdot)$ to denote the expected value, the MSE is defined as

$$\mathrm{MSE}(\hat{x}) = E\big((\hat{x} - x')^2\big) = \big(E(\hat{x}) - x'\big)^2 + E\big((\hat{x} - E(\hat{x}))^2\big) = \mathrm{bias}^2(\hat{x}) + \mathrm{cov}(\hat{x}) \qquad (11)$$

that is, as the sum of the square of the bias (denoted as $\mathrm{bias}(\hat{x})$) and the variance (denoted as $\mathrm{cov}(\hat{x})$). Based on the criterion of minimizing the MSE, there are compromise estimators which outperform both uncorrected biased estimators and corrected unbiased

estimators. The ideal linear estimator would perform a

partial bias correction using CLS. In Appendix B we

derive how much this correction (theoretically) should

be. It depends on the covariance of the LS estimates.

The larger the covariance, the less correction.
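The decomposition in Eq. (11) and the bias-variance trade-off can be checked numerically on a toy problem. The sketch below is not one of the estimators analyzed in this paper: it assumes a hypothetical scalar quantity estimated from five noisy samples and compares the unbiased sample mean with a deliberately shrunken (biased) version of it.

```python
import numpy as np

rng = np.random.default_rng(2)
x_true, noise_std, n, trials = 1.0, 2.0, 5, 20000   # hypothetical toy values

samples = x_true + noise_std * rng.normal(size=(trials, n))
unbiased = samples.mean(axis=1)      # unbiased estimator with larger variance
shrunk = 0.8 * unbiased              # hypothetical biased estimator with smaller variance

def mse_parts(est, truth):
    """Return (MSE, squared bias, variance) computed from a sample of estimates."""
    mse = np.mean((est - truth) ** 2)
    bias_sq = (np.mean(est) - truth) ** 2
    var = np.var(est)                # population variance (ddof=0)
    return mse, bias_sq, var

mse_u, b2_u, var_u = mse_parts(unbiased, x_true)
mse_s, b2_s, var_s = mse_parts(shrunk, x_true)
print("unbiased: MSE, bias^2, var =", mse_u, b2_u, var_u)
print("shrunk:   MSE, bias^2, var =", mse_s, b2_s, var_s)
```

In this toy setting the biased estimator has the smaller MSE, which illustrates why a full bias correction is not necessarily the best choice when the variance is large.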

This brings us to the point of a psychometric function

describing quantitatively the effects of noise on the perception. We should assume that our vision system is

doing its best and thus has learned to correct whenever it

has data available to obtain estimates of the noise. Since

the correction should depend on the variance, a measure

of misestimation should involve both the bias and the

variance of the LS estimator; we suggest to investigate a

weighted sum of these two components. Such a measure

may explain why the illusory perception in some of the patterns weakens after extended viewing, in particular

when subjects are asked to fixate (Helmholtz, 1962; Yo

& Wilson, 1992). In this case, we can assume the noise

parameters stay fixed, and the visual system can rea-

sonably well estimate them. Furthermore, such a mea-

sure would predict that the density of a pattern should

have an influence on the perception, with slightly more

accurate estimates for denser patterns.

Finally, one may think there are ways to correct the

bias without using statistics. The bias depends on the

gradient distribution, that is, the directions of edgels or motion measurements. Since an uneven distribution

(corresponding to an ill-conditioned matrix M) leads to

a large bias, one may want to normalize for direction.

We could do so, if we had very large amounts of data.

However, with few measurements we cannot. Consider

that there are only two directions. It would mean that

we give more weight to the direction with few measurements, which has large variance, and lesser weight to the other direction, which has smaller variance. With few measurements only, this could lead to large overcorrection.

Also, it should be clear that in our models of feature

estimation, we cannot employ higher-level knowledge

about the structure of the scene. For example, com-

puting the intersection of lines by first estimating the

average direction of each individual line and then intersecting the lines, would not give bias. But this

would require that the noisy elements are classified first

into two categories. The same applies for the flow in

plaids. If we wanted to estimate the motion of the

individual gratings, we would first need to understand

that there are exactly two different directions and classify the motion components. If, as in Ferrera and Wilson (1991), pairs of noisy measurements are intersected, and the elements are randomly sampled (without classifying them first), there will be a bias in the direction of

the flow component with more measurements. If the

elements are segmented and each pair has an element of

either grating, there should be a slight overestimation in

magnitude. This is because the underestimation in LS results from the asymptotic behavior; terms linear in $1/n$ or of higher order do not affect the asymptotic bias, but for very few measurements (two in this case) these terms cause a bias in the opposite direction.

6. Discussion

6.1. Theories of illusions

Theories about illusions have been formulated ever

since their discovery. Many of the theories are aimed

only at one specific illusion and most of the early the-

ories would be seen nowadays as adding little to a mere

description of the illusion. Most theories which attempt to explain a broad spectrum of illusions are based on a

specific sort of mechanism which has been suggested to

explain the workings of human visual perception. These

mechanisms are either hypothetical, based on physical

analogies, or general observations about the physiology

and architecture of the brain (for example, lateral inhi-

bition). The theory proposed here is of a mathematical

nature based on the geometry and statistics of capturing

light rays and computing their properties (gray values

and spatio-temporal derivatives), and thus it applies to

any vision system, artificial or biological. However, one might find that our theory resembles some existing

theories if they are put in a statistical framework.

Chiang (1968) proposed a retinal theory that applies

to patterns in which lines running close together affect

one another. He likened the light coming into the eye

from two lines to the light falling onto a screen from two

slit sources in the study of diffraction patterns. This is

because of the blurring and diffusing effect of the medium and the construction of the eye. The perceived

locations of the lines are at the peaks of their distribu-

tions; thus two close lines influence each other’s loca-

tions and become one when the sum of their

distributions forms a single peak. This leads to an

overestimation of acute angles, and provides an explanation of the Zöllner, Poggendorff, and related illusions as well as the Müller-Lyer effect. Glass (1970) discussed optical blurring as a cause for the perceptual enlargement of acute angles, as it fills in the angle at line intersections, and he proposed this as an explanation of the Zöllner and Poggendorff effect. Ginsburg argued

that neural blurring occurs because of lateral inhibition

and this has the effect of spatial frequency filtering. In

Ginsburg (1975) he suggested that low frequency

attenuation contributes to the formation of the illusory Kanizsa triangle and in Ginsburg (1984) he discussed filtering processes as a cause for the Müller-Lyer illusion. In a number of recent studies band-pass filtering

has been proposed as a cause of geometrical optical

illusions; Morgan and Moulden (1986), and Earle and

Maskell (1993) discuss linear band-pass filters as a cause

of the café wall illusion. Morgan and Casco (1990) discuss band-pass filtering in conjunction with second-order stage filters that extract features in the Zöllner and Judd illusion. Bulatov et al. (1997) propose an elaborate neurophysiological model and they discuss the Müller-Lyer and Oppel–Kundt illusion. Morgan (1999) suggests blurring in second stage filters as the main cause of the Poggendorff illusion and (Morgan & Glennerster, 1991; Morgan, Hole, & Glennerster, 1990) discuss large receptive second stage eclectic units, which obtain as input heterogeneous features from smaller subfields, as cause for the Müller-Lyer illusion.

The diffraction in Chiang’s model, the optical and

neural blurring in later models, amounts to uncertainty

in the location of the perceived gray-level values, or they

can be interpreted as noise occurring somewhere in the

image formation or the image processing. Thus the

concept of uncertainty is invoked in vague ways in these

studies. The models that have been discussed are, how-

ever, very restricted. They apply to particular processes,

either on the retina or the neural hardware.

There are a number of theories in which eye move-

ments are advanced as an important causative factor in

illusions. Our theory also proposes that eye movements

play a role because they are a relevant source of noise.

The particular eye movements made in looking at a pattern influence the noise distribution and thus the bias

perceived, but there are other noise sources besides eye

movements, and this predicts the existence of illusory

effects for some patterns even under fixation or tachis-

toscopic viewing.

Helmholtz (1962) suggested that ocular movements

are of importance in some illusions, but he also expressed doubt that they could be the main source, as other illusions are not influenced by them. Carr (1935)

proposed that the eyes react to accessory lines and as a

result pass more easily over unfilled than filled elements.

In the Müller-Lyer figures the eyes move more freely over the figure with outgoing fins than over the one with ingoing fins, and in the Poggendorff and Zöllner figures deflections and hesitations in the eye movements are associated with the intersections of the long lines with the obliques. Piaget (1961) proposed a "law of relative centrations." By "centration" he refers to a kind of

centering of attention which is very much related to

fixation. Centration on part of the field causes an

overestimation of that part relative to the rest of the

field. Virsu (1971) suggested that a tendency to eye

movements, that is, instructions for eye movements, has

a perceptual effect. He suggests that the eye movements most readily made are linear and rectilinear, horizontal

or vertical. When viewing lines which lie off the vertical

or horizontal, an eye movement correction must take

place and this can give rise to perceptual distortion.

Other theories include those whose main objective

was the explanation of figural aftereffects applied to

illusions (Ganz, 1966; Köhler & Wallach, 1944). In these theories interference between nearby lines occurs because of satiation in the cortex or lateral inhibition

processes. There are also theories based on the

assumption that the perceptual system interprets illusory

patterns as flat projections of three-dimensional displays

(Tausch, 1954; Thiéry, 1896). The most detailed and most popular such theory is due to Gregory (1963), who invokes "size constancy scaling," which can be triggered in two different ways, either unconsciously or by higher-level awareness.

6.2. Illusions of size

Following Boring (1942), patterns which are called geometrical optical illusions may be classified roughly

into two categories: illusions of direction (where orien-

tation of a line or figure is misjudged) and illusions of

extent (where size or length is misjudged). The patterns

discussed so far are in the first class. They have been

explained on the basis of noise in the local measure-

ments. Most illusions in the second class could be ex-

plained if noise were present in the representations of

quantities which cover larger regions. Such noise may be

attributed to the processing in the visual system. From

evidence we have about the varying size of receptive fields, it is understood that images are processed at

multiple levels of resolution (Zeki, 1993). Since the

higher-level processes of segmentation and recognition

need information from large parts of the image, there

must be processes that combine information from local

neighborhoods into representations of global informa-

tion, and these processes may carry uncertainty.

Intuitively, noise affecting regions of larger spatial extent causes blurring over these regions. Thus, neighboring parts in a figure influence each other and as a

result they are perceived closer in distance. This obser-

vation can explain most of the significant illusions in the

second class which are: the contrast effect; the illusion of

filled and unfilled extent, that is, the effect that a filled

extent is overestimated when compared with an unfilled

extent (an example is the Oppel–Kundt illusion (Kundt, 1863)); the framing effect, that is, the overestimation of

framed objects; and the flattening of arcs, that is, the

effect that short arcs are perceptually flattened. Take as

an example the Delboeuf illusion (Delboeuf, 1892): two concentric circles with the outer one perceptually decreased and the inner one increased. Blurring which affects both circles will cause the inner one to expand and the outer one to contract. Another example is the famous Müller-Lyer illusion. Bias in the intersection

points due to noise in local measurements can account

for a small size change in the vertical lines, but it re-

quires noise in larger regions to account for significant

size changes as experienced with this pattern. Similarly,

such noise will increase the effect in the Poggendorff

illusion, and it may account for the perception in vari-

ations of this pattern.

6.3. Bias in 3D

Many other computations in the visual system, besides feature extraction, are estimation processes. Clearly, the visual recovery processes are estimations, that is, the computations which extract physical and

geometrical properties of the scene by inverting the

(optical and geometrical) image formation. Examples

are the estimations of the light source, the motion, the

structure and shape of the scene and the material

properties of surfaces.

We recently started investigating these computations. Results are available for shape from motion, stereo and texture from all three known cues, that is, foreshortening, scaling and position (Hui & Fermüller, 2003). We

analyzed the bias and found that it is consistent with

what is empirically known about the estimation of

shape. It has been observed from computational as well

as psychophysical experiments, that for many configu-

rations there is a tendency to underestimate the slant.

(The slant refers to the angle between the surface normal

and the optical axis.) In other words, the scene appears

compressed in the depth dimension. The bias predicts this underestimation of slant.

We created an illusory display demonstrating the effect for 3D motion, which can be viewed at (Fermüller, 2003). It shows a plane with two textures, one in the

upper half, one in the lower half. The camera moves

around the plane while fixating on its center. Because

the bias in the upper texture is much larger than in the

lower one, the plane appears to be segmented into two parts, with the upper part of much smaller slant. This

should demonstrate our point. The bias not only is a

problem of 2D vision, but of 3D vision as well.

Appendix A. Expected value of the least squares solution

In this section a second-order Taylor expansion of the

expected values of the least squares solution for both the

intersection point (Section 3) and the flow (Section 4) is

given.

Let $\vec{y}$ be the vector to be estimated, that is, either the intersection point $\vec{x}$ or the flow $\vec{u}$. $I_s$ is the matrix consisting of the spatial derivatives $I_{x_i}, I_{y_i}$. In Section 3, $\vec{c}$ is the vector whose elements are $I_{x_i} x_{0_i} + I_{y_i} y_{0_i}$, and in Section 4, $\vec{c}$ is the vector of temporal derivatives $I_{t_i}$.

The expected value $E(\vec{y})$ of the least squares solution is given by

$$E(\vec{y}) = E\left( (I_s^T I_s)^{-1} (I_s^T \vec{c}) \right)$$

As the noise is assumed to be independent and zero-mean, the first-order terms as well as the second-order terms in the noise of the temporal derivatives (or the positional parameters) vanish. This means that it is only the noise in the spatial derivatives which causes bias in the mean. The expansion at the point $N = 0$ (i.e., $\delta I_{x_i} = \delta I_{y_i} = \delta I_{t_i} = 0$ or $\delta I_{x_i} = \delta I_{y_i} = \delta x_{0_i} = \delta y_{0_i} = 0$) can be written as

$$E(\vec{y}) = \vec{y}' + \sum_i \left( \left. \frac{\partial^2 \vec{y}}{\partial \delta I_{x_i}^2} \right|_{N=0} \frac{E(\delta I_{x_i}^2)}{2} + \left. \frac{\partial^2 \vec{y}}{\partial \delta I_{y_i}^2} \right|_{N=0} \frac{E(\delta I_{y_i}^2)}{2} \right)$$

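For notational simplicity, we define

$$M = I_s^T I_s, \quad \vec{b} = I_s^T \vec{c}, \qquad M' = {I'_s}^T I'_s, \quad \vec{b}' = {I'_s}^T \vec{c}'$$

To compute the partial derivatives, the explicit terms of the matrix $M$ are

$$M = \begin{bmatrix} \sum_i (I'_{x_i} + \delta I_{x_i})^2 & \sum_i (I'_{x_i} + \delta I_{x_i})(I'_{y_i} + \delta I_{y_i}) \\ \sum_i (I'_{x_i} + \delta I_{x_i})(I'_{y_i} + \delta I_{y_i}) & \sum_i (I'_{y_i} + \delta I_{y_i})^2 \end{bmatrix}$$

and the terms of $\vec{b}$ are, for the flow (Section 4),

$$\vec{b} = \begin{bmatrix} \sum_i (I'_{x_i} + \delta I_{x_i})(I'_{t_i} + \delta I_{t_i}) \\ \sum_i (I'_{y_i} + \delta I_{y_i})(I'_{t_i} + \delta I_{t_i}) \end{bmatrix}$$

and, for the intersection point (Section 3),

$$\vec{b} = \begin{bmatrix} \sum_i (I'_{x_i} + \delta I_{x_i})^2 (x'_{0_i} + \delta x_{0_i}) + (I'_{x_i} + \delta I_{x_i})(I'_{y_i} + \delta I_{y_i})(y'_{0_i} + \delta y_{0_i}) \\ \sum_i (I'_{x_i} + \delta I_{x_i})(I'_{y_i} + \delta I_{y_i})(x'_{0_i} + \delta x_{0_i}) + (I'_{y_i} + \delta I_{y_i})^2 (y'_{0_i} + \delta y_{0_i}) \end{bmatrix}$$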

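Using the fact that for an arbitrary invertible matrix $Q$

$$-\frac{\partial Q^{-1}}{\partial x} = Q^{-1} \frac{\partial Q}{\partial x} Q^{-1}$$

we find the first-order and second-order derivatives to be

$$\frac{\partial \vec{y}}{\partial \delta I_{x_i}} = -M^{-1} \begin{bmatrix} 2I_{x_i} & I_{y_i} \\ I_{y_i} & 0 \end{bmatrix} M^{-1} \vec{b} + M^{-1} \frac{\partial \vec{b}}{\partial \delta I_{x_i}}$$

$$
\begin{aligned}
\frac{\partial^2 \vec{y}}{\partial \delta I_{x_i}^2} = {} & 2 M^{-1} \begin{bmatrix} 2I_{x_i} & I_{y_i} \\ I_{y_i} & 0 \end{bmatrix} M^{-1} \begin{bmatrix} 2I_{x_i} & I_{y_i} \\ I_{y_i} & 0 \end{bmatrix} M^{-1} \vec{b} - M^{-1} \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} M^{-1} \vec{b} \\
& - 2 M^{-1} \begin{bmatrix} 2I_{x_i} & I_{y_i} \\ I_{y_i} & 0 \end{bmatrix} M^{-1} \frac{\partial \vec{b}}{\partial \delta I_{x_i}} + M^{-1} \frac{\partial^2 \vec{b}}{\partial \delta I_{x_i}^2}
\end{aligned}
$$

and similarly we have symmetric expressions for $\partial \vec{y} / \partial \delta I_{y_i}$ and $\partial^2 \vec{y} / \partial \delta I_{y_i}^2$.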

Since we assume $E(\delta I_{x_i}^2) = E(\delta I_{y_i}^2) = \sigma_s^2$, the expansion can thus be simplified to

$$
\begin{aligned}
E(\vec{y}) = {} & \vec{y}' \; \underline{{} - n M'^{-1} \vec{y}' \sigma_s^2 + \sum_{i=1}^{n} M'^{-1} \left. \left( \frac{\partial^2 \vec{b}}{\partial \delta I_{x_i}^2} + \frac{\partial^2 \vec{b}}{\partial \delta I_{y_i}^2} \right) \right|_{N=0} \frac{\sigma_s^2}{2}} \\
& + \sum_i \Bigg\{ M'^{-1} \left( \begin{bmatrix} 2I'_{x_i} & I'_{y_i} \\ I'_{y_i} & 0 \end{bmatrix} M'^{-1} \begin{bmatrix} 2I'_{x_i} & I'_{y_i} \\ I'_{y_i} & 0 \end{bmatrix} + \begin{bmatrix} 0 & I'_{x_i} \\ I'_{x_i} & 2I'_{y_i} \end{bmatrix} M'^{-1} \begin{bmatrix} 0 & I'_{x_i} \\ I'_{x_i} & 2I'_{y_i} \end{bmatrix} \right) \vec{y}' \\
& \quad - M'^{-1} \left( \begin{bmatrix} 2I'_{x_i} & I'_{y_i} \\ I'_{y_i} & 0 \end{bmatrix} M'^{-1} \left. \frac{\partial \vec{b}}{\partial \delta I_{x_i}} \right|_{N=0} + \begin{bmatrix} 0 & I'_{x_i} \\ I'_{x_i} & 2I'_{y_i} \end{bmatrix} M'^{-1} \left. \frac{\partial \vec{b}}{\partial \delta I_{y_i}} \right|_{N=0} \right) \Bigg\} \sigma_s^2
\end{aligned}
$$

where we have underlined the terms that do not depend on $n$ (where $n$ is the number of measurements being combined in a region). These terms will give a consistent, statistically constant response. The rest of the terms diminish proportionally to $1/n$. Informal experiments show that these terms become negligible for $n > 5$, a number clearly smaller than the number of terms likely to be combined in any real system.

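In the analysis of Section 3, the elements of $\vec{c}$ are $I_{x_i} x_{0_i} + I_{y_i} y_{0_i}$; thus

$$\left. \frac{\partial \vec{b}}{\partial \delta I_{x_i}} \right|_{N=0} = \begin{bmatrix} 2 I'_{x_i} x'_{0_i} + I'_{y_i} y'_{0_i} \\ I'_{y_i} x'_{0_i} \end{bmatrix}, \qquad \left. \frac{\partial^2 \vec{b}}{\partial \delta I_{x_i}^2} \right|_{N=0} = \begin{bmatrix} 2 x'_{0_i} \\ 0 \end{bmatrix}$$

$$\left. \frac{\partial \vec{b}}{\partial \delta I_{y_i}} \right|_{N=0} = \begin{bmatrix} I'_{x_i} y'_{0_i} \\ I'_{x_i} x'_{0_i} + 2 I'_{y_i} y'_{0_i} \end{bmatrix}, \qquad \left. \frac{\partial^2 \vec{b}}{\partial \delta I_{y_i}^2} \right|_{N=0} = \begin{bmatrix} 0 \\ 2 y'_{0_i} \end{bmatrix}$$

and the main terms in the expected value of the intersection point $E(\vec{x})$ are

$$E(\vec{x}) = \vec{x}' - n M'^{-1} \vec{x}' \sigma_s^2 + M'^{-1} \begin{bmatrix} \sum_i x'_{0_i} \\ \sum_i y'_{0_i} \end{bmatrix} \sigma_s^2$$

or

$$E(\vec{x}) = \vec{x}' + n M'^{-1} (\bar{\vec{x}}'_0 - \vec{x}') \sigma_s^2$$

where $\bar{\vec{x}}'_0$ denotes the mean of the values $\vec{x}'_{0_i}$.

In the analysis in Section 4 the elements of $\vec{c}$ are $I_{t_i}$. Thus the first-order derivatives are

$$\left. \frac{\partial \vec{b}}{\partial \delta I_{x_i}} \right|_{N=0} = \begin{bmatrix} I'_{t_i} \\ 0 \end{bmatrix} \quad \text{and} \quad \left. \frac{\partial \vec{b}}{\partial \delta I_{y_i}} \right|_{N=0} = \begin{bmatrix} 0 \\ I'_{t_i} \end{bmatrix}$$

and the second-order derivatives vanish. The expected value of the flow $E(\vec{u})$ simplifies to

$$E(\vec{u}) = \vec{u}' - n M'^{-1} \vec{u}' \sigma_s^2.$$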
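The second-order prediction for the flow can be compared with a direct simulation. The sketch below is only an illustration under assumed values: unit-length noise-free gradients with orientations spread between 20° and 70°, a hypothetical true flow `u_true`, and independent Gaussian noise with standard deviations `sigma_s` (spatial derivatives) and `sigma_t` (right-hand side). The sign of the temporal derivative is absorbed into the right-hand side `c_true`.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma_s, sigma_t, trials = 400, 0.05, 0.05, 5000
u_true = np.array([1.0, 0.0])                    # hypothetical true flow u'

# Hypothetical noise-free unit gradients with orientations between 20 and 70 degrees.
theta = np.deg2rad(np.linspace(20.0, 70.0, n))
Is_true = np.column_stack([np.cos(theta), np.sin(theta)])   # rows (I'_x, I'_y)
c_true = Is_true @ u_true                        # noise-free right-hand side c'

# Second-order prediction: E(u) = u' - n * M'^{-1} u' * sigma_s^2
M_true = Is_true.T @ Is_true
predicted = u_true - n * sigma_s**2 * np.linalg.solve(M_true, u_true)

total = np.zeros(2)
for _ in range(trials):
    Is = Is_true + sigma_s * rng.normal(size=(n, 2))   # noisy spatial derivatives
    c = c_true + sigma_t * rng.normal(size=n)          # noisy right-hand side
    total += np.linalg.solve(Is.T @ Is, Is.T @ c)      # LS flow estimate
mc_mean = total / trials

print("predicted E(u): ", predicted)
print("simulated mean: ", mc_mean)
```

With this uneven gradient distribution the average estimate is both shorter than the true flow and rotated towards the dominant gradient direction, in line with the underestimation and the directional bias described in Section 4.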

Appendix B. Bias correction

Let us write the equation system as $A\vec{x} = \vec{b}$. Let $\vec{x}_{LS}$ denote the LS estimator for the errors-in-variables model. Its expected bias amounts to

$$\mathrm{bias}(\vec{x}_{LS}) = \lim_{n \to \infty} E(\vec{x}_{LS} - \vec{x}') = -\sigma^2 \left( \lim_{n \to \infty} \left( \frac{1}{n} A'^T A' \right) \right)^{-1} \vec{x}'$$

Assuming the variance of the error $\delta A$ is known, the bias can be removed with the CLS estimator, which amounts to

$$\vec{x}_{CLS} = (A^T A - n\sigma^2 I)^{-1} (A^T \vec{b})$$

and can be rewritten as

$$\vec{x}_{CLS} = (I - n\sigma^2 (A^T A)^{-1})^{-1} \vec{x}_{LS}$$

Denoting the variance as $\mathrm{cov}(\cdot)$, the variance of $\vec{x}_{CLS}$ amounts to

$$\mathrm{cov}(\vec{x}_{CLS}) = (I - n\sigma^2 (A^T A)^{-1})^{-2} \, \mathrm{cov}(\vec{x}_{LS})$$

Let us now investigate what the theoretically best linear estimator should be. We have to adjust the corrected least squares estimator, such that we achieve smaller MSE (as defined in Eq. (11)). Let $\vec{x}_a = a\vec{x}_{CLS} + (1 - a)\vec{x}_{LS}$ denote the adjusted CLS estimator. Then we have that

$$
\begin{aligned}
\mathrm{MSE}(\vec{x}_a) &= a^2 \, \mathrm{cov}(\vec{x}_{CLS}) + (1 - a)^2 \, \mathrm{bias}^2(\vec{x}_{LS}) \\
&= a^2 (I - n\sigma^2 (A^T A)^{-1})^{-2} \, \mathrm{cov}(\vec{x}_{LS}) + (1 - a)^2 \, \mathrm{bias}^2(\vec{x}_{LS})
\end{aligned}
$$

which is a quadratic expression in $a$. Thus, the MSE minimum is achieved for

$$a = \frac{\mathrm{bias}^2(\vec{x}_{LS})}{\mathrm{bias}^2(\vec{x}_{LS}) + \mathrm{cov}(\vec{x}_{LS}) (I - n\sigma^2 (A^T A)^{-1})^{-2}}$$

This shows that according to the MSE criterion a less bias corrected $\vec{x}_a$ is better than a bias corrected $\vec{x}_{CLS}$. The larger the variance of the LS estimates, the less the correction should be.
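The closed form for $a$ can be checked numerically in the scalar case. The sketch below assumes hypothetical scalar values for the squared bias, the LS variance, $n$, $\sigma$ and $A^TA$ (none of them taken from the experiments in this paper), evaluates the quadratic MSE expression on a fine grid, and compares the grid minimum with the formula.

```python
import numpy as np

def optimal_weight(bias_sq, cov_ls, n, sigma, ata):
    """Scalar weight a minimizing a^2 * cov(x_CLS) + (1 - a)^2 * bias^2(x_LS)."""
    cov_cls = cov_ls / (1.0 - n * sigma**2 / ata) ** 2   # cov(x_CLS) for scalar A^T A
    return bias_sq / (bias_sq + cov_cls), cov_cls

# Hypothetical scalar values chosen only for illustration.
bias_sq, n, sigma, ata = 0.04, 50, 0.3, 45.0
a_small_var, cov_cls_small = optimal_weight(bias_sq, 0.01, n, sigma, ata)
a_large_var, cov_cls_large = optimal_weight(bias_sq, 0.10, n, sigma, ata)

# Brute-force check of the closed form on a fine grid of a in [0, 1].
grid = np.linspace(0.0, 1.0, 100001)
mse = grid**2 * cov_cls_small + (1.0 - grid) ** 2 * bias_sq
a_grid = grid[np.argmin(mse)]

print("closed-form a (small LS variance):", a_small_var)
print("grid-search a (small LS variance):", a_grid)
print("closed-form a (large LS variance):", a_large_var)
```

In this example the weight on the fully corrected estimator drops when the LS variance is increased, which is the qualitative statement made above.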

References

Adelson, E. H., & Movshon, J. A. (1982). Phenomenal coherence of

moving visual patterns. Nature, 300(December), 523–525.

Blasdel, G. G. (1992). Differential imaging of ocular dominance and

orientation selectivity in monkey striate cortex. Journal of Neuro-

science, 12, 3115–3138.

Boring, E. G. (1942). Sensation and perception in the history of

experimental psychology. New York: Appleton-Century-Crofts.

Bulatov, A., Bertulis, A., & Mickiene, L. (1997). Geometrical illusions:

Study and modelling. Biological Cybernetics, 77, 395–406.

Burke, D., & Wenderoth, P. (1993). The effect of interactions between

one-dimensional component gratings on two-dimensional motion

perception. Vision Research, 33, 343–350.

Cameron, E., & Steele, W. (1905). The Poggendorff illusion. Psycho-

logical Monographs, 7, 83–107.

Canny, J. (1986). A computational approach to edge detection. IEEE

Transactions on Pattern Analysis and Machine Intelligence, 8, 679–

698.

Carpenter, R. H. S. (1988). Movements of the eye. London: Pion.

Carr, H. A. (1935). An introduction to space perception. New York: Longmans, Green.

Chiang, C. (1968). A new theory to explain geometrical illusions

produced by crossing lines. Perception and Psychophysics, 3, 174–

176.

Delboeuf, J. L. R. (1892). Sur une nouvelle illusion d’optique. Bulletin

de l’Academie Royale de Belgique, 24, 545–558.

Earle, D. C., & Maskell, S. (1993). Fraser cords and reversal of the café wall illusion. Perception, 22, 383–390.

Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. Chapman & Hall.

Fermüller, C. (2003). Available: http://www.optical-illusions.org.

Fermüller, C., Pless, R., & Aloimonos, Y. (2000). The Ouchi illusion

as an artifact of biased flow estimation. Vision Research, 40,

77–96.

Fermüller, C., Shulman, D., & Aloimonos, Y. (2001). The statistics of

optical flow. Computer Vision and Image Understanding, 82, 1–32.

Ferrera, V. P., & Wilson, H. R. (1987). Direction specific masking and

the analysis of motion in two dimensions. Vision Research, 27,

1783–1796.

Ferrera, V. P., & Wilson, H. R. (1990). Perceived direction of

two-dimensional moving patterns. Vision Research, 30, 273–

287.

Ferrera, V. P., & Wilson, H. R. (1991). Perceived speed of moving two-

dimensional patterns. Vision Research, 31, 877–893.

Fraser, J. (1908). A new visual illusion of direction. British Journal of

Psychology, 2, 307–320.

Fuller, W. (1987). Measurement error models. New York: Wiley.

Furmanski, C., & Engel, S. (2000). An oblique effect in human primary

visual cortex. Nature Neuroscience, 3, 535–536.

Ganz, L. (1966). Mechanism of the F.A.E.’s. Psychological Review, 73,

128–150.

Gillam, B. (1998). Illusions at century’s end. In Perception and

cognition at century’s end (pp. 95–136). Academic Press.

Ginsburg, A. P. (1975). Is the illusory triangle physical or imaginary?

Nature, 257, 219–220.

Ginsburg, A. P. (1984). Visual form perception based on biological

filtering. In L. Spillmann & B. R. Wooten (Eds.), Sensory

experience, adaptation and perception (pp. 53–72). New Jersey: L.

Erlbaum.

Glass, L. (1970). Effect of blurring on perception of a simple geometric

pattern. Nature, 228, 1341–1342.

Green, R., & Hoyle, G. (1964). Adaptation level and the optic-

geometric illusions. Nature, 201, 1200–1201.

Gregory, R. L. (1963). Distortion of visual space as inappropriate

constancy scaling. Nature, 199, 678–680.

Grossberg, S., & Mingolla, E. (1985a). Neural dynamics of form

perception: Boundary completion, illusory figures, and neon color

spreading. Psychological Review, 92(2), 173–211.

Grossberg, S., & Mingolla, E. (1985b). Neural dynamics of perceptual

grouping: Textures, boundaries and emergent segmentations.

Perception and Psychophysics, 38(2), 141–171.

Helmholtz, H. L. F. V. (1962). Treatise on physiological optics (Vol.

III). New York: Dover, translated from the third German edition

by J.P.C. Southall.

Hering, E. (1861). Beiträge zur Psychologie (Vol. 1). Leipzig: Engelmann.

Hildreth, E. (1983). The measurement of visual motion. Cambridge,

MA: MIT Press.

Horn, B. K. P. (1986). Robot vision. New York: McGraw-Hill.

Horn, B. K. P., & Schunck, B. G. (1981). Determining optical flow.

Artificial Intelligence, 17, 185–203.

Howard, I., & Templeton, W. (1966). Human spatial orientation. New

York: Wiley.

Hubel, D., & Wiesel, T. (1968). Receptive fields and functional

architecture of the monkey striate cortex. Journal of Physiology

(London), 195, 215–243.

Hubel, D. H., & Wiesel, T. N. (1961). Integrative action in the cat’s

lateral geniculate body. Journal of Physiology (London), 155, 385–

398.

Hui, J., & Fermüller, C. (2003). Uncertainty in 3D shape estimation. In

Proceedings of the 3rd international workshop on statistical and

computational theories of vision, to appear.

Judd, C., & Courten, H. (1905). The Zöllner illusion. Psychological

Monographs, 7, 112–139.

Kitaoka, A. (2003). Available: http://www.ritsumei.ac.jp/akitaoka/index-e.html.

Koenderink, J. J. (1984). The structure of images. Biological Cybernetics, 50, 363–370.

Köhler, W., & Wallach, H. (1944). Figural after-effects: An investigation

of visual processes. Proceedings of the American Philosophical

Society, 88, 269–357.

Kooi, F. L., Valois, K. K. D., Grosof, D. H., & Valois, R. L. D.

(1992). Properties of the recombination of one-dimensional motion

signals into a pattern motion signal. Perception and Psychophysics,

52, 415–424.

Kundt, A. (1863). Annalen der Physik und Chemie. Untersuchungen

über Augenmaß und optische Täuschungen. Pogg. Ann., 120, 118–

158.

Leibowitz, H., & Toffey, S. (1966). The effects of rotation and tilt on

the magnitude of the Poggendorff illusion. Vision Research, 6, 101–

103.

Lindeberg, T. (1994). Scale-space theory in computer vision. Boston:

Kluwer.

Luckiesh, M. (1922). Visual illusions. New York: Dover.

Marr, D., & Hildreth, E. C. (1980). A theory of edge detection.

Proceedings of the Royal Society of London B, 207, 187–217.

Mingolla, J., Todd, J., & Norman, J. (1992). The perception of

globally coherent motion. Vision Research, 32, 1015–1031.

Morgan, M. J. (1999). The Poggendorff illusion: A bias in the

estimation of the orientation of virtual lines by second-stage filters.

Vision Research, 39, 2361–2380.

Morgan, M. J., & Casco, C. (1990). Spatial filtering and spatial

primitives in early vision: An explanation of the Zöllner–Judd class

of geometrical illusions. Proceedings of the Royal Society of London

B, 242, 1–10.

Morgan, M. J., & Glennerster, A. (1991). Efficiency of locating centres

of dot clusters by human observers. Vision Research, 31, 2075–

2083.

Morgan, M. J., Hole, G., & Glennerster, A. (1990). Biases and

sensitivities in the geometrical illusions. Vision Research, 30, 1793–

1810.

Morgan, M. J., & Hotopf, N. (1989). Perceived diagonals in grids and

lattices. Vision Research, 29, 1005–1015.

Morgan, M. J., & Moulden, B. (1986). The Münsterberg figure and

twisted cords. Vision Research, 26(11), 1793–1800.

Morinaga, S. (1933). Untersuchungen über die Zöllnersche Täuschung.

Japanese Journal of Psychology, 8, 195–242.

Müller-Lyer, F. C. (1896). Zur Lehre von den optischen Täuschungen: Über Kontrast und Konfluxion. Zeitschrift für Psychologie, 9, 1–16.

Musatti, C. (1929). Sui fenomeni stereocinetici. Archivio Italiano di

Psicologia, 3, 105–120.

Nakayama, K., & Silverman, G. H. (1988a). The aperture problem––I:

Perception of nonrigidity and motion direction in translating

sinusoidal lines. Vision Research, 28, 739–746.

Nakayama, K., & Silverman, G. H. (1988b). The aperture problem––

II: Spatial integration of velocity information along contours.

Vision Research, 28, 747–753.

Oppel, J. J. (1855). Über geometrisch-optische Täuschungen (pp. 37–

47). Jahresbericht Phys. Ver. Frankfurt.

Orbison, W. (1939). Shape as a function of the vector field. American

Journal of Psychology, 52, 31–45.

Ouchi, H. (1977). Japanese and geometrical art. New York: Dover.

Oyama, T. (1960). Japanese studies on the so-called geometrical–

optical illusions. Psychologia, 3, 7–20.

Palmer, S. E. (1999). Vision science: photons to phenomenology.

Cambridge, MA: MIT Press.

Piaget, J. (1961). Les Mécanismes Perceptifs. Presses Universitaires de

France, translated by G.N. Seagrim as The Mechanisms of

Perception, Routledge & Kegan Paul, London, 1969.

Pinna, B., & Brelstaff, G. J. (2000). A new visual illusion of relative

motion. Vision Research, 40(16), 2091–2096.

Robinson, J. O. (1972). The psychology of visual illusion. London:

Hutchinson.

Simoncelli, E. P., Adelson, E. H., & Heeger, D. J. (1991). Probability

distributions of optical flow. In Proceedings of the IEEE conference

on computer vision and pattern recognition, Maui, Hawaii (pp. 310–

315).

Smith, A. T., & Edgar, G. K. (1991). Perceived speed and direction of

complex gratings and plaids. Journal of the Optical Society of

America, 8, 1161–1171.

Stoner, G. R., Albright, T. D., & Ramachandran, V. S. (1990).

Transparency and coherence in human motion perception. Nature,

344, 153–155.

Tausch, R. (1954). Optische Täuschungen als artifizielle Effekte der

Gestaltungsprozesse von Größen und Formkonstanz in der

natürlichen Raumwahrnehmung. Psychologische Forschung, 24,

299–348.

Thiéry, A. (1896). Über geometrisch optische Täuschungen. Philosophical Studies, 12, 67–126.

Virsu, V. (1971). Tendencies to eye movement, and misperception of

curvature, direction and length. Perception and Psychophysics, 9,

65–72.

Wagner, H. (1969). Simultaneous and successive contour displacements.

Ph.D. thesis, University of Wales.

Wallace, G. (1969). The critical distance of interaction in the Zöllner

illusion. Perception and Psychophysics, 5, 261–264.

Wallace, G., & Crampin, D. (1969). The effects of background density

on the Zöllner illusion. Vision Research, 9, 167–177.

Wallach, H. (1935). Über visuell wahrgenommene Bewegungsrichtung.

Psychologische Forschung, 20, 325–380.

Weiss, Y., & Adelson, E. H. (1998). Slow and smooth, a Bayesian

theory for the combination of local motion signals in human vision.

AI Memo 1616, MIT.

Witkin, A. P. (1983). Scale-space filtering. In Proceedings of the

international joint conference on artificial intelligence (pp. 1019–

1022).

Wundt, W. (1898). Akademie der Sächsischen Wissenschaften Leipzig,

Abhandlungen Die geometrisch-optischen Täuschungen. Abhandl.

mathphys. der sachs. Ges. Wiss., 24, 53–178.

Yo, C., & Wilson, H. R. (1992). Moving 2D patterns capture the

perceived direction of both lower and higher spatial frequencies.

Vision Research, 32, 1263–1270.

Yuille, A., & Poggio, T. (1986). Scaling theorems for zero-crossings.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 8,

15–25.

Zeki, S. M. (1993). A vision of the brain. London: Blackwell.

Zöllner, F. (1860). Annalen der Physik und Chemie. Über eine neue

Art von Pseudoskopie und ihre Beziehungen zu den von Plateau und

Oppel beschriebenen Bewegungsphänomenen. Ann. Phys. Chem.,

186, 500–523.