Ham Thesis

Geometric Methods in Perceptual Image Processing

A dissertation presented

by

Hamilton Yu-Ik Chong

to

The School of Engineering and Applied Sciences

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in the subject of

Computer Science

Harvard University

Cambridge, Massachusetts

May 2008


© 2008 - Hamilton Yu-Ik Chong

All rights reserved.


Abstract

derstanding. We single out suggestive contours and illumination valleys as particularly

interesting because although one is defined in terms of three-dimensional geometry and the

other in terms of image features, the two produce strikingly similar results (and effectively

convey a sense of shape). This suggests that the two types of curves capture similar pieces

of geometric information. To explore this connection, we develop some general techniques

for recasting questions about the image as questions about the surface.


Contents

Title Page

Abstract

Table of Contents

Citations to Previously Published Work

Acknowledgments

Dedication

1 Introduction

1.1 Perceptual Image Processing

1.1.1 Color Constancy

1.1.2 A Perception-based Color Space

1.1.3 Shape Perception and Line Drawings

1.2 Contributions

1.3 Overview of Thesis

2 A Basis for Color Constancy

2.1 Introduction and Previous Work

2.2 Theory

2.2.1 Measurement Constraints

2.3 Color Basis for Color Constancy

2.4 Relationship to Previous Characterizations

2.5 Results

2.5.1 Effective Rank of the World

2.5.2 Von Kries Sensors

2.5.3 White Patch Normalization

2.5.4 White Balancing

2.5.5 Constrained Optimization

2.6 Discussion

3 A New Color Space for Image Processing

3.1 Introduction

3.2 Previous work

3.3 A Perceptual Metric

3.3.1 Formalizing the Color Space Conditions

3.3.2 Form of the Metric

3.3.3 Parameter Estimation

3.4 Metric Results

3.5 Applications

3.5.1 Segmentation

3.5.2 Poisson Image Editing

3.6 Discussion

4 Differential Geometry

4.1 Manifolds, Tensors, and Calculus

4.1.1 Manifolds

4.1.2 Tensor Fields

4.1.3 Integrals

4.1.4 Derivatives

4.2 Intrinsic and Extrinsic Properties of Surfaces

4.2.1 First Fundamental Form

4.2.2 Second Fundamental Form

4.2.3 Tensors that live on the surface itself

4.2.4 Higher Order Derivatives

4.3 The Metric Equivalence Problem

5 Shapes from Curves

5.1 Introduction

5.2 Curve Definitions

5.2.1 Surface-only Curves

5.2.2 Environment-dependent Curves

5.3 Relations Between Image and Surface Curves

5.3.1 Image Plane and Orthographic Projection

5.3.2 Critical Points of Illumination

5.3.3 Saint-Venant and Suggestive Energies

5.3.4 Suggestive Contours and Shading

5.4 Discussion

6 Conclusions and Future Work

Bibliography

A Color Constancy Proofs

A.1 Conditions for Success

A.1.1 Proof of Proposition 1

A.1.2 Proof of Lemma 1

A.2 The Space of Solutions

B Color Space Proofs

B.1 Deriving the Functional Form

B.2 Recovering Weber's Law

C More Differential Geometry

C.1 Non-coordinate Bases

C.2 Structure Equations for Surfaces

C.3 Theorema Egregium

D Image and Surface Curves

D.1 Basic Facts

D.2 Screen Coordinate Vector Fields

D.3 Curve Relations

D.4 Principal Directions

D.5 Apparent Curvature

D.6 General Lighting


Citations to Previously Published Work

Large portions of Chapter 2 on color constancy have previously appeared in:

The von Kries Hypothesis and a Basis for Color Constancy.

H. Y. Chong, S. J. Gortler, T. Zickler.

In Proceedings of ICCV 2007.

The perceptual optimization of a basis for color constancy (Section 2.5.5) and the develop-

ment of a new color space as presented in Chapter 3 have previously appeared in:

A Perception-based Color Space for Illumination-invariant

Image Processing.

H. Y. Chong, S. J. Gortler, T. Zickler.

In Proceedings of SIGGRAPH 2008.

Acknowledgments

I would first like to thank my advisor Professor Steven Gortler for his support

over the years (during both undergraduate and graduate days). From the beginning, he

gave me great freedom in choosing topics to pursue and provided excellent guidance on

how to approach any chosen problem. I would also like to thank Professors Todd Zickler,

Fredo Durand, Roger Brockett, and Craig Gotsman, all of whom have served on at least

one of my oral exam committees (and read my various scribbles). They have served as

wonderful sounding boards for my (often outlandish) ideas and thoughts. Thanks also to

Professors Doug DeCarlo and Szymon Rusinkiewicz for very helpful discussions on curves

and surfaces. And thanks as well to Brian Guenter for his mentorship at Microsoft Research

and for giving me the opportunity to broaden my research exposure and interact with the

amazing full-timers, postdocs, and fellow interns there.

Graduate school is of course a sometimes trying experience, and without the mutual

support of friends (all going through similar trials and tribulations), the hurdles would

likely feel quite insurmountable. So special thanks to Michael Kellermann, Yi-Ting Huang,

Jimmy Lin, Daniel DeSousa, Eleanor Hubbard, Leo Nguyen, Michelle Gardner, James

Black, Mark Hempstead, and Camilo Libedinsky, who put up with my antics outside

of lab and joined me for sports and gaming. Many thanks as well to lab-mates Danil

Kirsanov, Guillermo Diez-Canas, Brenda Ng, Loizos Michael, Philip Hendrix, Ece Ka-

mar, Geetika Lakshmanan, Doug Nachand, Yuriy Vasilyev, Fabiano Romiero, Emmanuel

Turquin, Christopher Thorpe, Charles McBrearty, Zak Stone, Kevin Dale, Kalyan Sunkavalli,

Moritz Baecher, Miriah Meyer, Christian Ledergerber, and Forrester Cole, who made the

lab an inviting place. Thanks also to David Harvey, Wei Ho, and Ivan Petrakiev for their

math pointers (and conversations outside of math as well). Thanks also to my friends in

and around Saratoga.

And of course, profuse thanks go to my family for their constant encouragement

(despite their teasing me about the cold California winters). I certainly would not

have gotten far at all without their support. I am reminded of a Chinese adage (popularized

by Yao Ming, I believe): "How do I thank my family? How does a blade of grass thank

the sun?" Well, I do not have an answer to that question, so I will have to leave it at just

"Thanks!"

Thanks again to everyone (and apologies to the many other deserving people I've

left unmentioned)!


Dedicated to my parents Fu-Chiung and Kathleen Chong,

to my brothers Sanders and Anthony Chong,

and to all teachers.

Chapter 1

Introduction

1.1 Perceptual Image Processing

Perception may roughly be defined as the mind's process of disentangling information from

the physical means by which it is conveyed. As such, the study of perception is inherently a

study of abstract representations. In distilling information into abstract quanta, the human

mind prepares such information for conscious processing and consumption. In visual

perception, the inputs are retinal images and the outputs are mental descriptions (e.g., shape

and material properties) of the subject being observed.

On a spectrum of scales at which to probe human perception, the traditional

endeavors of cognitive psychologists and visual system neuroscientists may coarsely be

described as sitting at the two extremes. The former examines perceptual issues on a qual-

itative and macroscopic level while the latter identifies the microscopic biological building

blocks that allow for physical realization. Computational vision glues these two ends of the

spectrum together by providing an algorithmic description of how the functional building

blocks may give rise to the qualitative behaviors observed and classified through cognitive

experiments. Such a process (i.e., the algorithmic transformation of information) can be

studied independent of any of its physical instantiations, and hence falls within the purview

of computer science.

The aim of this dissertation is to propose new algorithmic models for aspects of

perceptual visual processing. While these models should replicate important features of

human vision, we by no means expect our models to fully describe reality. Rather, they are

simply meant to provide first order quantitative predictions; such predictions can then

be used to either bolster theories or point out assumptions in need of greater refinement.

One of our hopes is that these models might inform future experimental designs and aid

(even if only as a distracting, but somewhat informative, detour) in science's ongoing and

successive approximations to truth.

Ultimately, however, our primary aim in this work is not to predict actual human

behavior. Our main interest is in the design of new algorithms that assist computers in at-

tacking various problems in graphics and vision. Motivation for this comes from the robust

manner in which humans cope with environmental variations and successfully complete

a variegated set of visual tasks. By constructing new perception-inspired algorithms,

we may be able to endow computers with comparable robustness in accomplishing similar

goals. Therefore, the utility of these models will be judged by their usefulness in algorith-

mic problem solving; their value (as presented here) stands independent of the goodness

of our assumptions on how humans actually process visual information. In this work, we

seek progress on three problems: achieving color constancy, designing an illumination-

invariant color space, and relating projected image data to three-dimensional shape under-

standing.

1.1.1 Color Constancy

Color constancy refers to the fact that the perceived color of an object in a scene tends to

remain constant even when the spectral properties of the illuminant (and thus the tristimulus

data recorded at the retina) are changed drastically.

Figure 1.1: Arrows indicate pixel intensity values that have been replicated as uniform

squares of color outside the image context for side-by-side comparison with pixel intensity

values from the other image.

Figure 1.1 illustrates a local version of the effect. The left half of the figure

shows a set of puppets lit by normal daylight. The right half of the figure shows the same

set of puppets, but after an extreme environmental manipulation has been applied. While

no human viewer would consider pairs of spatially corresponding pixels between left and

right halves to be sharing the same color, even under this extreme change, the qualitative

experience of the colors is mostly preserved. For example, if asked to coarsely label the

color of the center puppet's shirt on the right (marked by one of the arrows), one would

settle upon the descriptor "yellow." However, when the pixel RGB values are pulled out

and displayed against a white background, one instead labels the same RGB values as

"blue." The surprising fact is that we do not simply see the right half of Figure 1.1 as

consisting only of various shades of blue or purple. We still experience, for example,

sensations of yellow (which is widely considered in vision science the color most opposite

that of blue [46]).

Color constancy is even more powerful when the same environmental manipula-

tion is applied to our entire visual field (i.e., is applied globally rather than locally). In such

cases, we do not have the bright white of the paper (or the original image colors) to deduce

the overall blue-ness of the new environment. For a common, simulated example of such

a global manipulation, consider the eventual color shift that occurs after putting on a pair

of ski goggles. The act of putting on goggles can be thought of as simulating a change

of illuminant to one in which there is less power in the short wavelengths. At first,

putting on a pair of ski goggles with an orange filter makes the world appear orange. However,

over time, the orange-ness disappears and colors are largely perceived as if goggles

had not been worn. When the goggles are later taken off, the world immediately appears

exceedingly blue until color perception once again shifts into an adapted state.

This phenomenon results from some kind of global processing that is still not

completely understood. One working model for achieving constancy is that the human first

somehow estimates the illuminant. The effect of illumination is then undone by applying

a tone mapping operator. This yields colors that are perceived as being lit under some

canonical illuminant. This interpretation may be referred to as illumination discounting

[18]. Another working model is that the human processes retinal images by relating colors

of observed objects. In converting retinal signals into perceived responses, the brain only

uses relationships that are not affected by illumination changes. Such an interpretation may

be referred to as spatial processing [18, 43].

Whatever the mechanism, this perceptual processing allows color to be treated as

an intrinsic material property. From our day-to-day perceptual experiences, we accept it as

sensible to refer to an apple as "red," or an orange as "orange." This, however, belies the fact

that color constancy is actually hard to achieve. Recall that the light seen reflected from

objects depends on the full spectrum of the illuminant and the per-wavelength attenuation

due to the material. Light is only turned into three sensory responses once it hits the eye.

Therefore, given scenes with arbitrary illuminant spectra and material reflectances, there is

no reason that constancy should be even close to possible. Consider the scenario in which

two different materials appear the same under one illuminant (i.e., they are metamers), but

look different under a second illuminant. In this case, one color maps to two distinct colors

with illuminant change, so color relations clearly do not remain constant.
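The metamer scenario can be made concrete with a small numerical sketch. Everything below is a hypothetical toy world rather than data from this thesis: spectra are assumed to be sampled at four wavelength bins, the three sensor sensitivities are invented, and the names `SENSORS` and `sensor_response` are illustrative only.

```python
# Toy world with four wavelength bins and three sensors.
# All numbers are invented for illustration; they are not measured spectra.

SENSORS = [
    [1.0, 0.0, 0.0, 0.0],  # responds to bin 0 only
    [0.0, 1.0, 1.0, 0.0],  # integrates bins 1 and 2 together
    [0.0, 0.0, 0.0, 1.0],  # responds to bin 3 only
]


def sensor_response(illuminant, reflectance):
    """Return the three sensor responses to light reflected off a material."""
    # Reflected light per bin = illuminant power * per-wavelength attenuation.
    reflected = [e * r for e, r in zip(illuminant, reflectance)]
    return [sum(s * x for s, x in zip(sensor, reflected)) for sensor in SENSORS]


# Two materials that differ only in how they split energy between bins 1 and 2.
material_a = [0.5, 0.25, 0.75, 0.5]
material_b = [0.5, 0.75, 0.25, 0.5]

flat_light = [1.0, 1.0, 1.0, 1.0]    # equal power in every bin
skewed_light = [1.0, 2.0, 1.0, 1.0]  # extra power in bin 1

# Under the flat light the two materials are indistinguishable (metamers).
same_under_flat = (
    sensor_response(flat_light, material_a)
    == sensor_response(flat_light, material_b)
)

# Under the skewed light the same two materials produce different responses.
same_under_skewed = (
    sensor_response(skewed_light, material_a)
    == sensor_response(skewed_light, material_b)
)
```

In this toy world the second sensor cannot tell bins 1 and 2 apart, so the two materials match under the flat light; once the illuminant weights those bins differently, the single shared color splits into two.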

Despite the manifest difficulty in achieving color constancy, models of color con-

stancy are often furthermore forced to obey an even more stringent constraint, the gener-

alized von Kries hypothesis [21]. This hypothesis stipulates that color correction is done

using independent scalings of the three tristimulus values. Von Kries' commonly adopted

model assumes that the effect of changing illumination is to multiply each of our trichromatic

sensor responses by a separate (possibly spatially varying) scale factor, and the brain

is thus able to discount the change by applying the inverse scale factor [18, 33, 46]. The

generalized von Kries model permits a change of color basis to take place before the gain

factors are applied.
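As a minimal sketch of the two models just described, the following code applies per-channel gains either directly (plain von Kries) or after a change of color basis (generalized von Kries). The basis matrix, the gain values, and the function names are assumptions chosen for illustration; in particular, the direction in which the basis change is written here is a convention of this sketch, not a statement of the formulation used in Chapter 2.

```python
def matvec(matrix, vector):
    """Multiply a 3x3 matrix (list of rows) by a length-3 vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def generalized_von_kries(color, gains, basis, basis_inv):
    """Change basis, scale each coordinate independently, change back."""
    coords = matvec(basis, color)                     # express color in the new basis
    scaled = [g * c for g, c in zip(gains, coords)]   # independent per-channel gains
    return matvec(basis_inv, scaled)                  # return to the original basis


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Hypothetical basis and its inverse (hand-picked so the inverse is obvious).
BASIS = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
BASIS_INV = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]

color = [0.25, 0.5, 0.125]
gains = [2.0, 0.5, 4.0]                      # assumed effect of an illuminant change
inverse_gains = [1.0 / g for g in gains]     # the correction that discounts it

# Plain von Kries: the gains act directly on the three sensor responses.
plain = generalized_von_kries(color, gains, IDENTITY, IDENTITY)

# Generalized von Kries: the gains act in the changed basis.
shifted = generalized_von_kries(color, gains, BASIS, BASIS_INV)

# Applying the inverse gains in the same basis undoes the change.
recovered = generalized_von_kries(shifted, inverse_gains, BASIS, BASIS_INV)
```

The point of the generalization is visible in `shifted`: with a non-identity basis, the first output channel depends on more than one input channel, even though the operation is still diagonal in the chosen basis.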

With such demanding restrictions, color constancy may seem a vain endeavor. Indeed,

humans do not have perfect color constancy; however, our color percepts are nonetheless

surprisingly stable [46]. This suggests that the world's illuminants and reflectances

are not completely arbitrary. One natural question is then, "What are the necessary and

sufficient conditions under which constancy is in fact achievable?" Answers to this are

discussed in Chapter 2. There, we will focus mostly on the generalized von Kries model. A

related question is, "To what extent are these conditions true in our everyday world?" We

will present experimental tests of this as well. Note that these questions concern the general

possibility of achieving color constancy with any von Kries-like approach. Therefore,

answers to these questions provide blanket statements that upper bound what is attainable

for any method employing per-channel gain control for color constancy (e.g., grey world,

grey edge [54], gamut mapping [24], retinex [33]). They do not address what algorithm in

particular best achieves color constancy under these conditions.

Given that the necessary conditions for generalized von Kries models to achieve

perfect color constancy may not be met exactly, we would also like to compute an opti-

mal basis in which to run von Kries-based color constancy algorithms. The problem of

computing such a basis is addressed in Section 2.3. As a follow-up question we also pose

the following: "To what extent is there evidence that humans process under the basis we

compute?" This question is left unanswered in this thesis.

With regard to applications, models for color constancy can be applied to color

correction problems such as white balancing. In white balancing, a scene is captured under

one ambient illuminant. The image is then later displayed, but under viewing conditions

that employ a different ambient illuminant. The mismatch in illuminants during capture and

display, and hence mismatch in observer adaptations in both cases, causes discrepancies

in the perceived colors of the displayed reproduction. A white balanced image is one in

which the captured colors are mapped to colors as seen under the new display illuminant.

By displaying the white balanced image, a viewer perceives the reproduced scene in the

same way the one who captured the image perceived the original scene.
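One simple way to picture white balancing is as a per-channel (von Kries-style) rescaling that maps the color of a reference white under the capture illuminant to the color of white under the display illuminant. The sketch below assumes both whites are already known, which sidesteps the hard problem of estimating the illuminant; the function name and all values are hypothetical, and this is not presented as the procedure evaluated later in the thesis.

```python
def white_balance(pixel, captured_white, display_white):
    """Rescale each channel so that captured_white lands on display_white."""
    # One gain per channel; assumes no channel of captured_white is zero.
    gains = [d / c for d, c in zip(display_white, captured_white)]
    return [g * p for g, p in zip(gains, pixel)]


# Hypothetical values: the capture illuminant is weak in channel 0
# and strong in channel 2 relative to the display illuminant.
captured_white = [0.5, 1.0, 2.0]
display_white = [1.0, 1.0, 1.0]

pixel = [0.25, 0.5, 1.0]  # a mid-grey surface as recorded at capture time
balanced = white_balance(pixel, captured_white, display_white)
```

Here the recorded grey has the same channel ratios as the captured white, so after balancing it becomes a neutral grey under the display white.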

1.1.2 A Perception-based Color Space

Given the importance of color processing in computer graphics, color spaces abound.

While existing color spaces address a range of needs, they all suffer from one notable

problem that makes them unsatisfactory for a large class of applications including segmen-

tation and Poisson image editing [48]: algorithms working in these color spaces exhibit

great sensitivity to the illumination conditions under which their inputs were captured.

Consider Figure 1.2 in which an active contour segmentation algorithm is run

on two images that differ only in the scene's illumination (these were generated using full-spectral

images and measured illuminant spectra). Figure 1.2a shows the two initial images.

Figure 1.2b shows their segmentations in a typical color space (the space of RGB values

provided by the image). Figure 1.2c shows their segmentations in another common color

space (CIELab). In both color spaces, the segmentation parameters were tuned on each

image to produce as clean and consistent segmentations as possible. Furthermore, even

after parameter tweaking, the segmentations do not appear particularly reliable. It would

be beneficial to have a segmentation algorithm that could use the same set of parameters to

produce consistent segmentations of the two images in spite of their differing illuminations.

(a) Pair of images to segment. (b) One typical color space. (c) Another color space.

Figure 1.2: Active contour segmentation of two images in two different color spaces with

parameters tweaked for each image so that segmentations are as clean and consistent as

possible. Optimal parameters for the image pairs are separated by 2 and 5 orders of magnitude,

respectively.

Figure 1.3 shows another example, this time of Poisson image editing. Here, a

picture of a bear is cut out from one image (Figure 1.3a) and inserted into a background

image of a swimming pool (Figure 1.3b). Figure 1.3c shows the result when run in a usual

color space (the space of RGB values provided by the image). In this image, the foreground

object is seamlessly inserted into the background image; however, the bear appears rather

faded and ghostly. For comparison, Figure 1.3d shows an insertion performed using the

color space we present in Chapter 3. Here the bear appears much more corporeal.

(a) Object and mask. (b) Background. (c) Typical result. (d) More desirable result.

Figure 1.3: Poisson image editing.

These examples point to the fact that algorithms working in current color spaces

lack an essential robustness to the illumination conditions of their inputs. If processing

is meant to operate on intrinsic scene properties, then such illumination-dependent results

are unsatisfactory. Since color constancy deals with taking measured colors and turning

them into stable percepts, addressing this problem is plausibly related to issues of color

constancy. Given an algorithm for color constancy, one solution might be to map an image's

input colors to some canonical set of colors (i.e., colors as seen under a standard illuminant)

and then work in such a space instead. One potential problem with such an approach is that

it may depend on the ability of the algorithm to perform difficult tasks like estimating the

illuminant in order to map colors to their canonical counterparts. An even better scenario

would be if some ability to edit colors in an intrinsic sense were built into the choice of

color space itself, without requiring the adoption of a specific color correction algorithm.

1.1.3 Shape Perception and Line Drawings

Shape perception refers to the process of taking image data and inferring the three-dimensional

geometry of objects within the image. While there are many approaches to estimating shape

(e.g., shape from shading [29], shape from texture [5], shape from specular flow [2]), we

shall be interested in the connection between shape understanding and curves in an image.


(a) Silhouettes and suggestive contours. (b) Silhouettes only.

Figure 1.4: Line renderings of an elephant.

As shown in Figure 1.4a, line drawings alone are often enough to convey a sense of shape.

This suggests that curves themselves provide a lot of information about a surface. Artists

are able to make use of this fact and invert the process; they take a reference object, either

observed or imagined, and summarize it as a line drawing that is almost unambiguously

interpretable. Two natural questions then emerge: (1) Given a three-dimensional object,

what curves should one draw to best convey the shape? (2) Given an image, is there a

stable set of curves that humans can robustly detect and use to infer shape, and if so, how

are they defined and what precisely is their information content?

Analyses of the more classically defined curves on surfaces (e.g., silhouettes,

ridges, parabolic lines) are inadequate in giving a precise understanding to this problem.

For example, Figure 1.4b shows that the relationships between curves and shape perception

involve much more than simply silhouette data. The same can be said of the traditionally

defined ridges and valleys. Researchers have more recently defined curves such as suggestive

contours [15] and apparent ridges [32] which take advantage of view information and

produce informative renderings, but more precise analyses of when and why they succeed

are still lacking.

A complete understanding of the set of curves humans latch onto clearly ex-

tends far beyond the scope of this work; we instead focus on laying some groundwork for

future investigations. Our guiding philosophy is roughly as follows: humans only have

retinal images as inputs; assuming curve-based shape understanding is a learned process,

any meaningful curves should then be detectable (in a stable manner) in retinal images;

therefore, analysis should be developed for relating curves defined in images to geometric

properties of the viewed surface. Philosophy aside, the same geometric techniques devel-

oped for studying such relations may also be applied to answering other sorts of questions

relating curves on surfaces to curves in images, so we hope these techniques may prove

useful more generally.

1.2 Contributions

Color Constancy. Color constancy is almost exclusively modeled with von Kries (i.e.,

diagonal) transforms [21]. However, the choice of basis under which von Kries transforms

are taken is traditionally ad hoc. Attempts to remedy the situation have been hindered by

the fact that no joint characterization of the conditions for worlds¹ to support (generalized)

von Kries color constancy has previously been achieved. In Chapter 2, we establish the

following:

• Necessary and sufficient conditions for a world to support doubly linear color constancy.

• Necessary and sufficient conditions for a world to support generalized von Kries color constancy.

• An algorithm for computing a locally optimal color basis for generalized von Kries color constancy.

¹ By a world, we mean a collection of illuminants, reflectances, and visual sensors. These are typically

described by a database of the spectral distributions for the illuminants, per-wavelength reflections for the

materials, and per-wavelength sensitivities for the sensors.

In the applications we discuss, we are mostly concerned with the cone sensors of the human

visual system. However, the same theory applies for camera sensors and can be generalized

in a straightforward manner for a different number of sensors.
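A world in the sense of the footnote above (illuminants, reflectances, and sensors) yields one measurement per (sensor, illuminant, reflectance) triple, so its measurements can be tabulated as a three-index array. The sketch below shows one plausible discrete layout. The index order, the four-bin sampling, the numbers, and the name `measurement_tensor` are all assumptions made for illustration; the precise object and the conditions placed on it are developed in Chapter 2.

```python
def measurement_tensor(sensors, illuminants, reflectances):
    """Tabulate sensor responses as M[k][i][j] for sensor k, illuminant i, reflectance j."""
    return [
        [
            [
                # Discrete stand-in for integrating sensitivity * illuminant * reflectance.
                sum(s * e * r for s, e, r in zip(sensor, illum, refl))
                for refl in reflectances
            ]
            for illum in illuminants
        ]
        for sensor in sensors
    ]


# Hypothetical toy world sampled at four wavelength bins.
sensors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
illuminants = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 1.0, 1.0],
]
reflectances = [
    [0.5, 0.25, 0.75, 0.5],
    [0.5, 0.75, 0.25, 0.5],
]

M = measurement_tensor(sensors, illuminants, reflectances)

# The same construction works for a different number of sensors.
M_two_sensors = measurement_tensor(sensors[:2], illuminants, reflectances)
```

Only the first index changes size when sensors are added or removed, which is one way to see why the theory carries over to cameras with a different number of channels.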

A Perception-based Color Space. In Chapter 3, we design a new color space with the

following properties:

• It has a simple 3D parameterization.

• Euclidean distances approximately match perceptual distances.

• Color displacements and gradients are approximately preserved under changes in the spectrum of the illuminant.

• Coordinate axes have an interpretation in terms of color opponent channels.

• It can easily be integrated with fibered space and measure-theoretic image models.

The first three bullet points imply the color space can easily be plugged into many image

processing algorithms (i.e., algorithms can be called as black-box functions and require no

change in implementation). This color space also enables these algorithms to work on the

intrinsic image and thereby yield (approximately) illumination-invariant results.

Shape Perception and Line Drawings. To study shape, we present some technical tools

in the context of relating Saint-Venant curves to suggestive contours. Contributions include

the following:

• Choice of bases for surface and image that simplifies calculations and makes expressions more conducive to interpretation.

• Transplanting of techniques for working with bases that change at every point and notation for reducing ambiguities.

• General procedure for expressing image information in terms of three-dimensional geometric quantities.

• Some relations between suggestive contours, Saint-Venant curves, and surface geometry.

• Relation between a surface-curve's normal curvature in the tangent direction and its apparent curvature in the image.

1.3 Overview of Thesis

The organization for the rest of the thesis is as follows:

Chapter 2. We analyze the conditions under which various models for color constancy can

work. We focus in particular on the generalized von Kries model. We observe that the von

Kries compatibility conditions are impositions only on the sensor measurements, not the

physical spectra. This allows us to formulate the conditions succinctly as rank constraints

on an order three measurement tensor. Given this, we propose an algorithm that computes

a (locally) optimal choice of color basis for von Kries color constancy and compare the

results against other proposed choices.

Chapter 3. Motivated by perceptual principles (in particular, color constancy), we derive a

new color space in which the associated metric approximates perceived distances and color

displacements capture relationships that are robust to spectral changes in illumination. The

resulting color space can be used with existing image processing algorithms with little

or no change to the methods. We show application to segmentation and Poisson image

editing.

Chapter 4. This chapter presents the mathematical background for Chapter 5 (where we

consider the relations between shapes and image curves). The previous chapters make only

minor references to this chapter, so readers only interested in the color processing portions

of this work can simply refer to this chapter as required. Chapter 2 makes use of multilinear

algebra, so reading the notation used for denoting tensors is sufficient. Chapter 3 makes

reference to the metric equivalence problem discussed in Section 4.3. The more involved

details are not so important, so a high level overview suffices. Chapter 5 makes full use

of the presented differential geometry. So readers interested in that chapter should read

Chapter 4 with more care.

Chapter 5. We use the formalism presented in Chapter 4 to relate curves detectable in

images to information about the geometry. We focus in particular on Saint-Venant valleys




and relate them to suggestive contours. We develop techniques for relating image and

surface information and apply these to characterizing curve behavior at critical points of

illumination. We also prove more limited results away from critical points of illumination.

Appendix A. This appendix provides proofs for the results cited in Chapter 2. It also

contains an analysis of the structure of the space of worlds supporting generalized von

Kries color constancy.

Appendix B. This appendix presents the proof that locks down the functional form of

our color space parameterization. We also prove that our model recovers the well-known

Weber's Law for brightness perception.

Appendix C. This appendix contains more detailed discussions on differential geometry

that are somewhat tangential to the main exposition. It includes some calculations that can

be used to verify some of the claims or get a better sense for how the formalism can be

used.

Appendix D. This appendix provides calculations and proofs for the various relations we

discuss in Chapter 5. It also details some further derivations that may be useful for future

work.



Chapter 2

A Basis for Color Constancy

In this chapter we investigate models for achieving color constancy. We devote particular

attention to the ubiquitous generalized von Kries model. In Section 2.2.1, we relate the

ability to attain perfect color constancy under such a model to the rank of a particular order

three tensor. For cases in which perfect color constancy is not possible, this relationship

suggests a strategy for computing an optimal color basis in which to run von Kries based

algorithms.

2.1 Introduction and Previous Work

For a given scene, the human visual system, post adaptation, will settle on the same per-

ceived color for an object despite spectral changes in illumination. Such an ability to dis-

cern illumination-invariant material descriptors has clear evolutionary advantages and also


largely simplifies (and hence is widely assumed in) a variety of computer vision algorithms.

To achieve color constancy, one must discount the effect of spectral changes in the illumination through transformations of an observer's trichromatic sensor response values. While many illumination-induced transformations are possible, it is commonly assumed that each of the three sensors reacts with a form of independent gain control (i.e., each sensor response value is simply scaled by a multiplicative factor), where the gain factors depend only on the illumination change [21, 33]. This is termed von Kries adaptation. Represented in linear algebra, it is equivalent to multiplying each column vector of sensor response values by a shared diagonal matrix (assuming spatially uniform illumination), and is therefore also referred to as the diagonal model for color constancy.

Note that while the initial von Kries hypothesis applied only to direct multiplica-

tive adjustments of retinal cone sensors, we follow [21] and use the term more loosely to

allow for general trichromatic sensors. We also allow for a change of color basis to occur

before the per-channel multiplicative adjustment. (Finlayson et al. [21] refer to this as a

generalized diagonal model for color constancy, and they term the change of color basis a

sharpening transform.)
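The diagonal model and its generalized form can be illustrated with a minimal NumPy sketch. The function name `diagonal_adapt`, the gain values, and the example basis matrix are illustrative assumptions of ours, not quantities taken from the experiments in this chapter.

```python
import numpy as np

def diagonal_adapt(colors, gains, basis=None):
    """Apply a (generalized) diagonal transform to 3xN sensor responses.

    colors: 3xN array, one column per sensor response vector.
    gains:  length-3 array of per-channel gain factors.
    basis:  optional invertible 3x3 matrix whose columns are the new color
            basis vectors; None means the sensor basis (plain von Kries).
    """
    colors = np.asarray(colors, dtype=float)
    D = np.diag(np.asarray(gains, dtype=float))
    if basis is None:
        return D @ colors
    C = np.asarray(basis, dtype=float)
    # Change to the new basis, scale each channel, then change back.
    return C @ D @ np.linalg.solve(C, colors)

# Two made-up color vectors (columns) and made-up gains.
colors = np.array([[0.2, 0.5],
                   [0.4, 0.1],
                   [0.6, 0.9]])
gains = np.array([2.0, 1.0, 0.5])
plain = diagonal_adapt(colors, gains)

# A hypothetical sharpening basis; any invertible 3x3 matrix would do here.
C = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.0, 0.1],
              [0.1, 0.0, 1.0]])
general = diagonal_adapt(colors, gains, basis=C)
```

The same diagonal matrix is shared by every column, which reflects the assumption of spatially uniform illumination.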

The (generalized) diagonal model is at the core of the majority of color constancy

algorithms. Even a number of algorithms not obviously reliant on the diagonal assumption

in fact rely on diagonal models following a change of color basis [21, 22]; their choice of

color basis is simply not explicit. Yet, despite the widespread use of the diagonal model,

good choices of color bases under which diagonal transforms can be taken are only partially

understood.




While these limitations have been well-documented, a more complete character-

ization of the conditions for von Kries compatibility has yet to be established. As a result,

the development of more powerful systems for choosing optimized color bases has been

slow. This chapter addresses these issues by answering the following questions:

(1) What are the necessary and sufficient conditions that sensors, illuminants, and mate-

rials must satisfy to be exactly von Kries compatible, and what is the structure of the

solution space?

(2) Given measured spectra or labeled color observations, how do we determine the color

space that best supports diagonal color constancy?

We observe that the joint conditions are impositions only on the sensor measure-

ments, not the physical spectra. This allows the von Kries compatibility conditions to be

succinctly formulated as rank constraints on an order 3 measurement tensor. Our analysis

leads directly to an algorithm that, given labeled color data, computes a locally optimal

choice of color basis in which to carry out diagonal color constancy computations. The

proposed framework also unifies most existing analyses of von Kries compatibility.

2.2 Theory

We define two notions of color constancy. The first definition captures the idea that a single adjustment to the sensors will map all material colors seen under an illuminant $E_1$ to reference colors under (a possibly chosen standard) illuminant $E_2$. The second definition (also known as relational color constancy) captures the idea that surface colors have a fixed relationship between each other no matter what overall illumination lights the scene. As stated, these two definitions are not interchangeable. One being true does not imply the other.

To define the issues formally, we need a bit of notation. Let $\mathcal{R}$ be the smallest (closed) linear subspace of $L^2$ functions enclosing the spectral space of materials of interest. Let $\mathcal{E}$ be the smallest (closed) linear subspace of $L^2$ functions enclosing the spectral space of illuminants of interest. Let $p^{R,E}$ be the color (in the sensor basis) of material reflectance $R(\lambda) \in \mathcal{R}$ under illumination $E(\lambda) \in \mathcal{E}$. In the following, $\bar{D}$ and $\bar{\bar{D}}$ are operators that take color vectors and map them to color vectors. $\bar{D}$ is required to be independent of the material $R$; likewise, $\bar{\bar{D}}$ is required to be independent of the illuminant $E$. The $\ast$ denotes the action of these operators on color vectors.

(1) Color constancy:

For all $E_1, E_2 \in \mathcal{E}$, there exists a $\bar{D}(E_1, E_2)$ such that for all $R \in \mathcal{R}$,

$$p^{R,E_2} = \bar{D}(E_1, E_2) \ast p^{R,E_1}$$

(2) Relational color constancy:

For all $R_1, R_2 \in \mathcal{R}$, there exists a $\bar{\bar{D}}(R_1, R_2)$ such that for all $E \in \mathcal{E}$,

$$p^{R_2,E} = \bar{\bar{D}}(R_1, R_2) \ast p^{R_1,E}$$

In the case that $\bar{D}$ and $\bar{\bar{D}}$ are linear (and hence identified with matrices), $\ast$ is just matrix-vector multiplication. If $\bar{D}$ is linear, we say that the world supports linear adaptive color constancy. If $\bar{\bar{D}}$ is linear, we say the world supports linear relational color constancy. $\bar{D}$ being linear does not imply $\bar{\bar{D}}$ is linear, and vice versa. If both $\bar{D}$ and $\bar{\bar{D}}$ are linear, we say the world supports doubly linear color constancy.


Figure 2.1: The 3xIxJ measurement tensor. The tensor can be sliced in three ways to produce the matrices $\Omega_{(j)}$, $\Lambda_{(i)}$, and $\Gamma_{(k)}$.

In particular, we shall be interested in the case when $\bar{D}$ and $\bar{\bar{D}}$ are both furthermore diagonal (under some choice of color basis). It is proven in [21] that for a fixed color space, $\bar{D}$ is diagonal if and only if $\bar{\bar{D}}$ is diagonal. So the two notions of color constancy are equivalent if either $\bar{D}$ or $\bar{\bar{D}}$ is diagonal, and we say the world supports diagonal color constancy (the "doubly" modifier is unnecessary). The equivalence is nice because we, as biological organisms, can likely learn to achieve definition 1, but seek to achieve definition 2 for inference.

Given a set of illuminants $\{E_i\}_{i=1,\dots,I}$, reflectances $\{R_j\}_{j=1,\dots,J}$, and sensor color matching functions $\{\rho_k\}_{k=1,2,3}$, we define a measurement data tensor (see Figure 2.1 and footnote 2):

$$M_{kij} := \int \rho_k(\lambda)\, E_i(\lambda)\, R_j(\lambda)\, d\lambda \qquad (2.1)$$

For fixed values of $j$, we get 3xI matrices $\Omega_{(j)} := M^{k}{}_{ij}$ that map illuminants expressed in the $\{E_i\}_{i=1,\dots,I}$ basis to color vectors expressed in the sensor basis. Likewise, for fixed values of $i$, we get 3xJ matrices $\Lambda_{(i)} := M^{k}{}_{ij}$ that map surface reflectance spectra expressed in the $\{R_j\}_{j=1,\dots,J}$ basis to color vectors. We can also slice the tensor by constant $k$ to get IxJ matrices $\Gamma_{(k)} := M_{kij}$.

Footnote 2: We will use Latin indices starting alphabetically with i to denote tensor components instead of the usual Greek letters to simplify notation and avoid confusion with other Greek letters floating around.

Figure 2.2: Core tensor form: a 3x3x3 core tensor is padded with zeros. The core tensor is not unique.
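As a rough numerical illustration of Equation (2.1) and the three slicings, the sketch below approximates the integral by a Riemann sum over uniformly sampled wavelengths. The random spectra, the sample counts, and the function name `measurement_tensor` are placeholders of ours; real use would substitute measured spectra sampled on a common wavelength grid.

```python
import numpy as np

def measurement_tensor(sensors, illuminants, reflectances, d_lambda=1.0):
    """Riemann-sum approximation of Equation (2.1).

    sensors:      3 x L array, sensor functions sampled at L wavelengths.
    illuminants:  I x L array, illuminant spectra on the same samples.
    reflectances: J x L array, reflectance spectra on the same samples.
    Returns M with shape (3, I, J), indexed as M[k, i, j].
    """
    return np.einsum("kl,il,jl->kij", sensors, illuminants, reflectances) * d_lambda

# Placeholder spectra (random, for shape illustration only).
rng = np.random.default_rng(0)
L, I, J = 31, 4, 5
rho = rng.random((3, L))
E = rng.random((I, L))
R = rng.random((J, L))
M = measurement_tensor(rho, E, R, d_lambda=10.0)

omega_0 = M[:, :, 0]   # 3 x I slice for a fixed material (j = 0)
lambda_0 = M[:, 0, :]  # 3 x J slice for a fixed illuminant (i = 0)
gamma_0 = M[0, :, :]   # I x J slice for a fixed sensor (k = 0)
```

Multiplying `omega_0` by a vector of illuminant mixing coefficients gives the color of material 0 under the mixed illuminant, which is the sense in which the fixed-material slices map illuminants to color vectors.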

Since color perception can depend only on the eye's trichromatic color measurements, worlds (i.e., sets of illuminant and material spectra) giving rise to the same measurement tensor are perceptually equivalent. To understand diagonal color constancy, therefore, it is sufficient to analyze the space of measurement tensors and the constraints that these tensors must satisfy. This analysis of von Kries compatible measurement tensors is covered in section 2.2.1.

Given a von Kries compatible measurement tensor (e.g., an output from the al-

gorithm in section 2.3), one may also be interested in the constraints such a tensor places

on the possible spectral worlds. This analysis is covered in section 2.4.

2.2.1 Measurement Constraints

The discussion in this section will always assume generic configurations (e.g., color mea-

surements span three dimensions, color bases are invertible). Proofs not essential to the

main exposition are relegated to Appendix A.

Proposition 1. A measurement tensor supports doubly linear color constancy if and only




if there exists a change of basis for illuminants and materials that reduces it to the core

tensor form of Figure 2.2.

More specifically (as is apparent from the proof of Proposition 1 in Appendix A.1.1), if a single change of illuminant basis makes all the $\Omega_{(j)}$ slices null past the third column, the measurement tensor supports linear relational color constancy. Likewise, a change of material basis making all the $\Lambda_{(i)}$ slices null past the third column implies the measurement tensor supports linear adaptive color constancy. Support for one form of linear constancy does not imply support for the other.

The following lemma provides a stepping stone to our main theoretical result and

is related to some existing von Kries compatibility results (see section 2.4).

Lemma 1. A measurement tensor supports generalized diagonal color constancy if and

only if there exists a change of color basis such that, for all $k$, $\Gamma_{(k)}$ is a rank-1 matrix.

This leads to our main theorem characterizing the space of measurement tensors

supporting generalized diagonal color constancy.

Theorem 1. A measurement tensor supports generalized diagonal color constancy if and only if it is a rank 3 tensor (see footnote 3).

An order 3 tensor (3D data block) $T$ is rank $N$ if $N$ is the smallest integer such that there exist vectors (see footnote 4) $\{a_n, b_n, c_n\}_{n=1,\dots,N}$ allowing decomposition as the sum of outer products (denoted by $\circ$):

$$T = \sum_{n=1}^{N} c_n \circ a_n \circ b_n. \qquad (2.2)$$

Without loss of generality, let $\{a_n\}$ be vectors of length $I$, corresponding to the illuminant axis of the measurement tensor; let $\{b_n\}$ be vectors of length $J$, corresponding to the material axis of the tensor; and let $\{c_n\}$ be vectors of length 3, corresponding to the color sensor axis of the tensor. Let the vectors $\{a_n\}$ make up the columns of the matrix $A$, vectors $\{b_n\}$ make up the columns of the matrix $B$, and vectors $\{c_n\}$ make up the columns of the matrix $C$. Then the decomposition above may be restated as a decomposition into the matrices $(A, B, C)$, each with $N$ columns.

Footnote 3: There exist measurement tensors supporting generalized diagonal color constancy with rank less than 3, but such examples are not generic.

Footnote 4: In the language of differential geometry introduced in Chapter 4, $\{a_n, b_n\}_{n=1,\dots,N}$ are really covector component lists. $\{c_n\}_{n=1,\dots,N}$ are really vector component lists.

Proof. (Theorem 1). First suppose the measurement tensor supports generalized diagonal color constancy. Then by Lemma 1, there exists a color basis under which each $\Gamma_{(k)}$ is rank-1 (as a matrix). This means each $\Gamma_{(k)}$ can be written as an outer product, $\Gamma_{(k)} = a_k \circ b_k$. In this color basis then, the measurement tensor is a rank 3 tensor in which the matrix $C$ (following notation above) is just the identity (see footnote 5). We also point out that an invertible change of basis (on any of $A$, $B$, $C$) does not affect the rank of a tensor, so the original tensor (before the initial color basis change) was also rank 3.

For the converse case, we now suppose the measurement tensor is rank 3. Since $C$ is (in the generic setting) invertible, multi-linearity gives us

$$C^{-1} \ast \left( \sum_{n=1}^{3} c_n \circ a_n \circ b_n \right) = \sum_{n=1}^{3} \left( C^{-1} c_n \right) \circ a_n \circ b_n. \qquad (2.3)$$

The operator $\ast$ on the left hand side of Equation (2.3) denotes the application of the 3x3 matrix $C^{-1}$ along the sensor axis of the tensor. The right hand side of Equation (2.3) is a rank 3 tensor with each $\Gamma_{(k)}$ slice a rank-1 matrix. By Lemma 1, the tensor must then support diagonal color constancy.

Footnote 5: Technically, this shows that the tensor is at most rank 3, but since we are working with generic tensors, we can safely discard the degenerate cases in which observed colors do not span the three-dimensional color space.
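The converse direction of the proof can be checked numerically on a synthetic example. The sketch below builds a rank 3 tensor from randomly chosen factor matrices (placeholders of ours, not data from this chapter), applies the inverse of the color factor matrix along the sensor axis, and confirms that each fixed-sensor slice becomes rank-1 and that a change of illuminant acts as a diagonal matrix in the new basis.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, N = 6, 7, 3
A = rng.random((I, N)) + 0.5               # illuminant factors a_n as columns
B = rng.random((J, N)) + 0.5               # material factors b_n as columns
C = np.eye(3) + 0.3 * rng.random((3, N))   # color factors c_n as columns (invertible)

# Rank 3 tensor of Equation (2.2): M[k, i, j] = sum_n C[k, n] A[i, n] B[j, n].
M = np.einsum("kn,in,jn->kij", C, A, B)

# Apply the inverse of C along the sensor axis (left hand side of Equation (2.3)).
M_new = np.einsum("mk,kij->mij", np.linalg.inv(C), M)

# Each fixed-sensor slice is now a single outer product of a_k and b_k.
slice_ranks = [int(np.linalg.matrix_rank(M_new[k], tol=1e-9)) for k in range(3)]
original_ranks = [int(np.linalg.matrix_rank(M[k], tol=1e-9)) for k in range(3)]

# In the new basis, moving from illuminant 0 to illuminant 1 is a diagonal map
# shared by all materials.
gains = A[1] / A[0]
colors_under_0 = M_new[:, 0, :]
colors_under_1 = M_new[:, 1, :]
is_diagonal_map = np.allclose(np.diag(gains) @ colors_under_0, colors_under_1)
```

The gain for each channel is just the ratio of the corresponding illuminant factor entries, which makes explicit why the columns of the color factor matrix serve as the basis for diagonal color constancy.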

In the proof above, note that the columns of $C$ exactly represent the desired color basis under which we get perfect diagonal color constancy. This theorem is of algorithmic importance because it ties the von Kries compatibility criteria to quantities (best rank 3 tensor approximations) that are computable via existing multilinear methods.

2.3 Color Basis for Color Constancy

Given a measurement tensor $M$ generated from real-world data, we would like to find the optimal basis in which to perform diagonal color constancy computations. To do this, we first find the closest von Kries compatible measurement tensor (with respect to the Frobenius norm). We then return the color basis that yields perfect color constancy under this approximate tensor.

By Theorem 1, finding the closest von Kries compatible measurement tensor is equivalent to finding the best rank 3 approximation. Any rank 3 tensor may be written in the form of equation (2.2) with $N = 3$. It also turns out that such a decomposition of a rank three tensor into these outer-product vectors is almost always unique (modulo permutations and scalings). We solve for $M$'s best rank 3 approximation (decomposition into $A$, $B$, $C$) via Trilinear Alternating Least Squares (TALS) [28]. For a rank 3 tensor, TALS forces $A$, $B$, and $C$ to each have 3 columns. It then iteratively fixes two of the matrices and solves for the third in a least squares sense.

Repeating these computations in lockstep guarantees convergence to a local minimum. $A$, $B$, $C$ can be used to reconstruct the closest von Kries compatible tensor and the columns of $C$ exactly represent the desired color basis.

As a side note, the output of this procedure differs from the best rank-(3,3,3)

approximation given by HOSVD [37]. HOSVD only gives orthogonal bases as output and

the rank-(3,3,3) truncation does not in general yield a closest rank 3 tensor. HOSVD may,

however, provide a good initial guess.

The following details on TALS mimic the discussion in [52]. For further information, see [28, 52] and the references therein. The Khatri-Rao product of two matrices $A$ and $B$ with $N$ columns each is given by

$$A \odot B := \left[ a_1 \otimes b_1,\; a_2 \otimes b_2,\; \dots,\; a_N \otimes b_N \right], \qquad (2.4)$$

where $\otimes$ is the Kronecker product.

Denote the flattening of the measurement tensor $M$ by $M_{IJ3}$ if the elements of $M$ are unrolled such that the rows of matrix $M_{IJ3}$ loop over the $(i,j)$-indices with $i = 1,\dots,I$ as the outer loop and $j = 1,\dots,J$ as the inner loop. The column index of $M_{IJ3}$ corresponds with the dimension of the measurement tensor that is not unrolled (in this case $k = 1,2,3$). The notation for other flattenings is defined symmetrically. We can then write

$$M_{JI3} = (B \odot A)\, C^T. \qquad (2.5)$$

By symmetry of equation (2.5), we can write out the least squares solutions for each of the matrices (with the other two fixed), where $\dagger$ denotes the pseudoinverse:

$$A = \left[ (B \odot C)^{\dagger} M_{J3I} \right]^T, \qquad (2.6)$$

$$B = \left[ (C \odot A)^{\dagger} M_{3IJ} \right]^T, \qquad (2.7)$$

$$C = \left[ (B \odot A)^{\dagger} M_{JI3} \right]^T. \qquad (2.8)$$
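The alternating updates of Equations (2.6) through (2.8) can be sketched in NumPy as follows. This is a minimal illustration under our own assumptions: the tensor is stored with shape (3, I, J), the function names are ours, the iteration count is fixed rather than driven by a convergence test, and the default starting guess is random (an HOSVD-based guess could be passed through `init` instead).

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of Equation (2.4).

    Row index a * Y.shape[0] + b holds X[a, n] * Y[b, n], so X's rows form
    the outer loop and Y's rows the inner loop.
    """
    return np.einsum("an,bn->abn", X, Y).reshape(-1, X.shape[1])

def reconstruct(A, B, C):
    """Tensor M[k, i, j] = sum_n C[k, n] * A[i, n] * B[j, n] (Equation (2.2))."""
    return np.einsum("kn,in,jn->kij", C, A, B)

def tals(M, rank=3, iters=200, init=None, seed=0):
    """Trilinear alternating least squares on a (3, I, J) measurement tensor.

    init is an optional (A, B, C) starting guess; otherwise the factors are
    drawn at random (an assumption of this sketch).
    """
    K, I, J = M.shape
    if init is None:
        rng = np.random.default_rng(seed)
        init = (rng.standard_normal((I, rank)),
                rng.standard_normal((J, rank)),
                rng.standard_normal((K, rank)))
    A, B, C = init
    # Flattenings: rows unroll the first two subscript letters (first letter
    # is the outer loop); columns follow the last letter.
    M_J3I = M.transpose(2, 0, 1).reshape(J * K, I)
    M_3IJ = M.reshape(K * I, J)
    M_JI3 = M.transpose(2, 1, 0).reshape(J * I, K)
    for _ in range(iters):
        A = (np.linalg.pinv(khatri_rao(B, C)) @ M_J3I).T  # Equation (2.6)
        B = (np.linalg.pinv(khatri_rao(C, A)) @ M_3IJ).T  # Equation (2.7)
        C = (np.linalg.pinv(khatri_rao(B, A)) @ M_JI3).T  # Equation (2.8)
    return A, B, C

# Usage on a synthetic, exactly rank 3 tensor (placeholder data).
rng = np.random.default_rng(0)
A_true = rng.standard_normal((8, 3))
B_true = rng.standard_normal((10, 3))
C_true = rng.standard_normal((3, 3))
M = reconstruct(A_true, B_true, C_true)
A, B, C = tals(M)
relative_error = np.linalg.norm(reconstruct(A, B, C) - M) / np.linalg.norm(M)
```

Each update is an exact least squares solve for one factor with the other two held fixed, so the reconstruction error cannot increase from one step to the next. The recovered factors are only determined up to the permutation and scaling ambiguities noted in the text.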

2.4 Relationship to Previous Characterizations

As mentioned in the introduction, there are two main sets of theoretical results. There are

the works of [20,58] that give necessary and sufficient conditions for von Kries compatibil-

ity under a predetermined choice of color space, and are able to build infinite dimensional

von Kries compatible worlds for this choice. Then there are the works of [21, 22] that

prescribe a method for choosing the color space, but only for worlds with low dimen-

sional linear spaces of illuminants and materials. We omit direct comparison to the various

spectral sharpening techniques [6, 16, 23] in this section, as these methods propose more

intuitive guidelines rather than formal relationships.

Previous analyses treat the von Kries compatibility conditions as constraints on spectra, whereas the analysis here treats them as constraints on color measurements. In this section, we translate between the two perspectives. To go from spectra to measurement tensors is straightforward. To go the other way is a bit more tricky. In particular, given a measurement tensor with rank-1 $\Gamma_{(k)}$, there is not a unique world generating this data. Any set of illuminants $\{E_i\}_{i=1,\dots,I}$ and reflectances $\{R_j\}_{j=1,\dots,J}$ satisfying Equation (2.1) (with $M$ and $\rho_k$ fixed) will be consistent with the data. Many constructions of worlds are thus possible. But if one first selects particular illuminant or material spectra as mandatory inclusions in the world, then one can state more specific conditions on the remaining spectral choices. For more on this, see Appendix A.2.

In [21, 22], it is shown that if the illuminant space is 3 dimensional and the material space is 2 dimensional (or vice versa), then the resulting world is (generalized) von Kries compatible. As a measurement tensor, this translates into stating that any 3x3x2 measurement tensor is (complex) rank 3. However this 3-2 condition is clearly not necessary as almost every rank 3 tensor is not reducible via change of bases to size 3x3x2. In fact, one can always extend a 3x3x2 tensor to a 3x3x3 core tensor such that the $\Gamma_{(k)}$ are still rank-1. The illuminant added by this extension is neither black with respect to the materials, nor in the linear span of the first two illuminants.

The necessary and sufficient conditions provided in [58] can be seen as special cases of Lemma 1. The focus on spectra leads to a case-by-case analysis with arbitrary spectral preferences. However, the essential property these conditions point to is that the 2x2 minors of $\Gamma_{(k)}$ must be zero (i.e., $\Gamma_{(k)}$ must be rank-1).
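The equivalence between vanishing 2x2 minors and rank-1 structure is easy to check directly. The helper below is a small sketch of ours (the name `max_abs_2x2_minor` is hypothetical): it returns the largest absolute 2x2 minor of a matrix, which is zero exactly when the matrix has rank at most 1.

```python
import numpy as np
from itertools import combinations

def max_abs_2x2_minor(G):
    """Largest absolute 2x2 minor of G; zero exactly when rank(G) <= 1."""
    G = np.asarray(G, dtype=float)
    rows, cols = G.shape
    worst = 0.0
    for r0, r1 in combinations(range(rows), 2):
        for c0, c1 in combinations(range(cols), 2):
            minor = G[r0, c0] * G[r1, c1] - G[r0, c1] * G[r1, c0]
            worst = max(worst, abs(minor))
    return worst

# An outer product has all 2x2 minors equal to zero (up to round-off).
rank_one = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0])
# The 2x2 identity has a single 2x2 minor, its determinant.
rank_two = np.eye(2)

minor_rank_one = max_abs_2x2_minor(rank_one)
minor_rank_two = max_abs_2x2_minor(rank_two)
```

In floating point, the rank-1 case should be compared against a small tolerance rather than exact zero.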

One case from [58] is explained in detail in [20]. They fix a color space, a space of material spectra, and a single reference illumination spectrum. They can then solve for the unique space of illumination spectra that includes the reference illuminant and is von Kries compatible (in the fixed color basis) with the given material space.

[Figure 2.3 diagram: a measurement tensor with axes i, j, k in which the entries marked * are nonzero and all other entries are 0.]

Figure 2.3: The rows of a single $\Lambda_{(1)}$ slice are placed into a new measurement tensor (rows are laid horizontally above) with all other entries set to zero. The * marks the nonzero entries.

In our framework, this can be interpreted as follows. The given input gives rise to a single $\Lambda_{(1)}$ measurement slice. The three rows of this slice can be pulled out and placed in a new measurement tensor of the form shown in Figure 2.3. This measurement tensor is then padded with an infinite number of zero $\Lambda_{(i)}$ matrices. The $\Gamma_{(k)}$ slices of this new tensor are clearly rank-1 matrices, and thus this tensor is von Kries compatible in the given color space. Moreover, any measurement tensor with rank-1 $\Gamma_{(k)}$ that includes the original $\Lambda_{(1)}$ slice in its span must have $\Lambda_{(i)}$ slices that are spanned by the $\Lambda_{(i)}$ slices in Figure 2.3. With this fixed tensor and the fixed material spectra, one can then solve Equation (2.1) to obtain the space of compatible illumination spectra. This space can be described by three non-black illuminants and an infinite number of black illuminants (giving zero measurements for the input material space). Since the original $\Lambda_{(1)}$ measurement slice is in the span of the $\Lambda_{(i)}$ slices, the original reference illuminant must be in the solution space.




2.5 Results

We used the SFU color constancy dataset to create a measurement tensor to use in our ex-

periments. The SFU database provides 8 illuminants simulating daylight, and 1,995 mate-

rials including measured spectra of natural objects. Fluorescent spectra were removed from

the dataset in hopes of better modeling natural lighting conditions since, in color matching

experiments, fluorescent lamps cause unacceptable mismatches of colored materials that

are supposed to match under daylight [60].

Color matching functions were taken to be CIE 1931 2-deg XYZ with Judd 1951

and Vos 1978 modifications [60]. To resolve mismatches in spectral sampling, we inter-

polated data using linear reconstruction. Cone fundamentals were taken to be the Vos and

Walraven (1971) fundamentals [60]. Experiments were run with illuminant spectra nor-

malized with respect to the L2 norm.

We followed the strategy of Section 2.3 to produce optimized color spaces for

von Kries color constancy.

2.5.1 Effective Rank of the World

To test the effective rank of our world (as measured by available datasets), we approximate the SFU measurement tensor with tensors of varying rank and see where the drop-off in approximation error occurs. Figure 2.4 shows the results from the experiment. We measured error in CIEDE2000 units in an effort to match human perceptual error. CIEDE2000 is the latest CIE standard for computing perceptual distance and gives the best fit of any method to the perceptual distance datasets used by the CIE [40]. As a rule of thumb, 1 CIEDE2000 unit corresponds to about 1 or 2 just noticeable differences.

[Figure 2.4 plot: "RMS Error of SFU Tensor Approximation". Horizontal axis: approximating tensor's rank (1 to 4). Vertical axis: color RMS error (CIEDE2000).]

Figure 2.4: The effective rank of the SFU dataset is about 3. The green dot marks the error for the rank 3 tensor in which the color gamut was constrained to include the entire set of human-visible colors (see Section 2.5.5).

The red curve in Figure 2.4 shows that the SFU measurement tensor is already quite well approximated by a rank 3 tensor. The rank 2 tensor's approximation error is 68.3% of the rank 1 tensor's approximation error. The rank 3 tensor's approximation error is 6.7% of the rank 2 tensor's approximation error. The rank 4 tensor's approximation error is 31.3% of the rank 3 tensor's approximation error. We discuss the meaning of the green dot later in Section 2.5.5.

2.5.2 Von Kries Sensors

[Figure 2.5 plots: four panels titled "CIE XYZ Response", "Cone Response", "TALS Effective Response", and "ANLS Effective Response". Horizontal axis: wavelength (nm), 380 to 680. Vertical axis: normalized response.]

Figure 2.5: Color matching functions. Top row shows the standard CIE XYZ and Cone matching functions. The bottom row shows the effective sensor matching functions resulting from applying the $C$ matrix derived from optimizations on the SFU database. ANLS (alternating nonlinear least squares) constrains the color space gamut to contain the human-visible gamut and measures perceptual error using CIEDE2000.

The matrix mapping XYZ coordinates to the new color coordinates, as computed using the TALS optimization on the SFU database, is given by the following (where the rows have been normalized to unit length):

C1 =

9.375111101 3.197979101 1.371214101

4.783729101 8.779015101 2.117135102

8.338662102 1.235156101 9.888329101

. (2.9)

We ran our algorithm on the Joensuu database as well [45]. The Joensuu database provides 22 daylight spectra and 219 natural material spectra (mostly flowers and leaves). The resulting basis vectors (columns of $C$) were within a couple degrees (in XYZ space) of the SFU optimized basis vectors. This seems to suggest some amount of stability in the result.

The effective sensor matching functions given by the optimized color basis are

shown in Figure 2.5. As is intuitively suspected, the optimized basis causes a sharpening




in the peaks of the matching functions. This expectation is motivated by the fact that the

diagonal model is exact for disjoint sensor responses. The ANLS result will be discussed

in Section 2.5.5.

2.5.3 White Patch Normalization

There are many experiments one might devise to measure the color constancy afforded by

various color bases. We chose to replicate a procedure commonly used in the literature.

This experiment is based on a white patch normalization algorithm, and is described

below.

Dataset and algorithms

We ran our color basis algorithm on the SFU dataset [7] and compared our resulting color

basis against previous choices (the cone sensor basis, 4 bases derived from different low

dimensional approximations of spectral data, that of Barnard et al. [6], and the sensor

sharpened basis [23]).

The low dimensional worlds to which we compare are taken to have either 3

dimensional illuminant spaces and 2 dimensional material spaces (a 3-2 world) or vice

versa (a 2-3 world); this allows computing color bases via the procedure in [21].

We took two different approaches to approximating spectra with low dimensional vector spaces. In the first approach (described in [21, 23]), we run SVD on the illuminant and material spectra separately. We then save the best rank-3 and rank-2 approximations. This is Finlayson's "perfect sharpening" method for databases with multiple lights [23], and is one of the algorithms that falls under the label of spectral sharpening.
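The low-rank approximation step of this first approach can be sketched with a truncated SVD. The sketch assumes the spectra are stacked as rows of a matrix and that no mean-centering is applied beforehand; both choices, along with the function name `best_rank_r` and the random placeholder data, are assumptions of ours rather than details given in [21, 23].

```python
import numpy as np

def best_rank_r(spectra, r):
    """Best rank-r approximation (in Frobenius norm) of a spectra matrix via SVD."""
    U, s, Vt = np.linalg.svd(np.asarray(spectra, dtype=float), full_matrices=False)
    # Keep only the r largest singular values and their singular vectors.
    return (U[:, :r] * s[:r]) @ Vt[:r]

# Placeholder spectra: 20 spectra sampled at 8 wavelengths.
rng = np.random.default_rng(0)
spectra = rng.random((20, 8))
illuminant_like_3d = best_rank_r(spectra, 3)  # e.g. a 3 dimensional space
material_like_2d = best_rank_r(spectra, 2)    # e.g. a 2 dimensional space
```

The residual of such an approximation is measured on the spectra themselves, which is the point of contrast with the tensor-based alternative that measures error in sensor space.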

As pointed out in [42], if error is to be measured in sensor space, there are alter-

natives to running PCA on spectra. Given a measurement tensor, the alternative (tensor-

based) approach instead applies SVD on the tensor flattenings MJ3I and M3IJ to get the

principal combination coefficients of the spectral bases (to be solved for) that approximate

the sample spectra. Refer to [42] for details.

We label the algorithm of Barnard et al. [6] as "Barnard". This is a more recent algorithm and also falls under the label of spectral sharpening.

We also ran experiments against Finlayson's "sensor sharpening" algorithm. Note that this algorithm does not actually use the database in determining the sensor transforms. Its heuristic is simply to transform the sensors so that the responses are as "sharp" as possible.

We also tested against a modified version of Finlayson's "database sharpening" method [23]. This algorithm as stated is defined in the case when the database has two illuminants and possibly many materials. Since our database has multiple lights, we used PCA on the set of illuminant spectra and ran the algorithm using the two most dominant principal components as our two illuminants. The results were nearly identical to both of the 2-3 methods, and we therefore omit the corresponding curves from the graphs.




Experimental procedure

We run the same white-patch normalization experiment as in [21]. As input, we are given a chosen white material $W$ and an illuminant $E$. For the SFU database, we used the only material labeled as "white" as our white material $W$. For every other material $R$, we compute a descriptor by dividing each of its 3 observed color coordinates by the 3 color coordinates of $W$ (the resulting 3 ratios are then transformed as a color vector to XYZ coordinates so that consistent comparisons can be made with different choices of color space). In a von Kries world, the descriptor for $R$ would not depend on the illuminant $E$. To measure the "non von Kries-ness" of a world, we can look at how much these descriptors vary with the choice of $E$.

More formally, we define the descriptor as:

$$d^{W,R}_{E} = C \left[ \operatorname{diag}\left( C^{-1} p^{W,E} \right) \right]^{-1} C^{-1} p^{R,E} \qquad (2.10)$$

The function diag creates a matrix whose diagonal elements are the given vector's components. $C$ is a color basis. $C$, $p^{R,E}$, and $p^{W,E}$ are given in the CIE XYZ coordinate system. This means we compute the color vectors $p^{W,E}$ and $p^{R,E}$ as:

$$(p^{W,E})_k := \int \rho_k(\lambda)\, E(\lambda)\, W(\lambda)\, d\lambda \qquad (2.11)$$

$$(p^{R,E})_k := \int \rho_k(\lambda)\, E(\lambda)\, R(\lambda)\, d\lambda \qquad (2.12)$$

with $\rho_1(\lambda) = \bar{x}(\lambda)$ = CIE response function for X, $\rho_2(\lambda) = \bar{y}(\lambda)$ = CIE response function for Y, and $\rho_3(\lambda) = \bar{z}(\lambda)$ = CIE response function for Z. The columns of the matrix $C$ are the (normalized) basis vectors of a new color space in XYZ coordinates. $C^{-1}$ is the inverse of $C$. This means that given some color vector $v$ in XYZ coordinates, $C^{-1} v$ computes the coordinates of the same color vector expressed in the new color space's coordinate system.

To compute a non von Kries-ness error, we fix a canonical illuminant $E'$ and compute descriptors $d^{W,R}_{E'}$ for every test material $R$. We then choose some different illuminant $E$ and again compute descriptors $d^{W,R}_{E}$ for every test material $R$. Errors for every choice of $E$ and $R$ are computed as:

$$\text{Error} = 100 \times \frac{\lVert d^{W,R}_{E} - d^{W,R}_{E'} \rVert}{\lVert d^{W,R}_{E'} \rVert} \qquad (2.13)$$
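The descriptor and error computations can be sketched as follows. The function names are ours, the example basis and colors are made-up values, and the error is assumed to be normalized by the descriptor computed under the canonical illuminant.

```python
import numpy as np

def descriptor(C, p_white, p_material):
    """White-patch descriptor: per-channel ratios in basis C, mapped back to XYZ.

    C:          3x3 matrix whose columns are the new basis vectors (in XYZ).
    p_white:    XYZ color of the white material under some illuminant.
    p_material: XYZ color of the test material under the same illuminant.
    """
    C = np.asarray(C, dtype=float)
    w = np.linalg.solve(C, np.asarray(p_white, dtype=float))     # white in new basis
    r = np.linalg.solve(C, np.asarray(p_material, dtype=float))  # material in new basis
    return C @ (r / w)

def von_kries_error(d_test, d_canonical):
    """Percent deviation of a test descriptor from the canonical one
    (assumed here to be normalized by the canonical descriptor)."""
    d_test = np.asarray(d_test, dtype=float)
    d_canonical = np.asarray(d_canonical, dtype=float)
    return 100.0 * np.linalg.norm(d_test - d_canonical) / np.linalg.norm(d_canonical)

# A made-up world that is exactly diagonal in the basis C: each illuminant
# scales the material coordinates (in the new basis) by per-channel gains.
C = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.0, 0.1],
              [0.1, 0.0, 1.0]])
material = np.array([0.3, 0.6, 0.2])
white = np.array([0.9, 1.0, 0.8])
gains_canonical = np.array([1.0, 1.0, 1.0])
gains_test = np.array([1.5, 0.8, 0.6])

d_canonical = descriptor(C, C @ (gains_canonical * white), C @ (gains_canonical * material))
d_test = descriptor(C, C @ (gains_test * white), C @ (gains_test * material))
error_percent = von_kries_error(d_test, d_canonical)
```

In this constructed world the gains cancel in the ratio, so the two descriptors coincide and the error vanishes up to round-off; with measured spectra the error quantifies how far the data are from diagonal in the chosen basis.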

For each instance of the experiment, we choose one SFU test illuminant $E$ and compute the errors over all test materials (canonical illuminant $E'$ is kept the same for each experimental instance). Each time, the color basis derived from running our method (labeled as "Optimized") performed the best.

To visualize the results of the experimental test phase, we plot histograms of % color vectors that are mapped correctly with a diagonal transform versus the % allowable error. Hence, each histogram plot requires the specification of two illuminants: one canonical and one test. Figure 2.6 shows the cumulative histograms for instances in which the stated basis performs the best and worst relative to the next best basis. Relative performance between two bases is measured as a ratio of the areas under their respective histogram curves. The entire process is then repeated for another canonical $E'$ to give a total of 4 graphs.
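The cumulative histograms and the area-ratio comparison can be sketched as below. The threshold grid, the trapezoidal area rule, and the function names are assumptions of ours; the text does not specify how the areas were computed.

```python
import numpy as np

def cumulative_histogram(errors, thresholds):
    """Percent of error values at or below each allowable-error threshold."""
    errors = np.asarray(errors, dtype=float)
    return np.array([100.0 * np.mean(errors <= t) for t in thresholds])

def area_under_curve(values, thresholds):
    """Trapezoidal area under a curve sampled at the given thresholds."""
    values = np.asarray(values, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(thresholds)))

def relative_performance(errors_a, errors_b, thresholds):
    """Ratio of areas under the cumulative histograms of basis a and basis b."""
    area_a = area_under_curve(cumulative_histogram(errors_a, thresholds), thresholds)
    area_b = area_under_curve(cumulative_histogram(errors_b, thresholds), thresholds)
    return area_a / area_b

# Made-up percent errors for two hypothetical color bases.
thresholds = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
errors_a = [1.0, 2.0, 3.0, 4.0]
errors_b = [2.0, 4.0, 6.0, 8.0]
curve_a = cumulative_histogram(errors_a, thresholds)
ratio = relative_performance(errors_a, errors_b, thresholds)
```

A ratio above 1 indicates that the first basis maps a larger share of color vectors within the allowed error over the threshold range considered.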




Figure 2.6: Percent vectors satisfying von Kries mapping versus percent allowable error.

Each curve represents a different choice of color space. For low dimensional worlds, the

dimension of the illuminant space precedes the dimension of the material space in the

abbreviated notation. Low dimensional approximations were obtained either by running PCA on spectra or by tensor methods described in text. We show the experimental instances

in which our derived basis performs the best and worst relative to the next best basis. The

left and right halves differ in choice of canonical illuminant for testing. Unlike Barnard,

our method effectively optimizes all pairs of lights.

White patch normalization results

The optimization algorithm labeled as Barnard requires specification of a canonical
illuminant and chooses a color basis such that the mapping between any other test illuminant
and the canonical one is as diagonal as possible (in a least squares sense).

To be fair, we show two sets of results: one where the canonical illuminant during
test matches the canonical illuminant used in Barnard optimization; and one set where a
different canonical illuminant is chosen during test from that used in Barnard optimization.
The second canonical illuminant is chosen to illustrate the best worst-case relative
performance of our algorithm. Goodness is measured as the ratio of areas under the histogram
curves.

Barnard's algorithm performs close to ours for some pairings of test and canonical
illuminants, but is outperformed in most cases. These results are explained theoretically
by the fact that, even though it optimizes over multiple light-pairs, the method of Barnard
et al. always optimizes with respect to a single canonical illuminant. In contrast, we
effectively optimize over all possible pairs of lights.

As noted earlier, we also tested against Finlayson's database sharpening method
[23] (using PCA on lights to handle multiple lights). The results were nearly identical to
both of the 2-3 methods.

2.5.4 White Balancing

In this section, we discuss a rough perceptual validation we performed. We first
downloaded a hyperspectral image from the database provided by [25]. Each pixel of the
hyperspectral image gives a discretized set of samples for the per-wavelength reflectance
function of the material seen at that pixel. The particular image we used was that of a
flower shown in Figure 2.7. Assuming a lighting model in which viewed rays are simply
computed as the product of the illuminant and reflectance spectra in the wavelength domain,
we rendered the scene as it would appear under a tungsten light bulb and as it would appear
under daylight. For visualization we converted all images into RGB coordinates and gamma
corrected each coordinate before display. The goal of the algorithm was to start with the
input image of the flower under a tungsten light and transform it into the target image of
the flower under daylight using a generalized diagonal model. The goodness of a color basis
is judged by the degree to which it facilitates the transformation of the input into the
desired output (with the significance of discrepancies determined by human perceptual error).
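The rendering model described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the sensor sensitivity curves passed as `sensors`, the display gamma of 2.2, and the clipping to the range [0, 1] are choices made for the example and are not taken from the experiment.

```python
def render_pixel(reflectance, illuminant, sensors):
    """Render one pixel from sampled spectra.

    The viewed ray is the per-wavelength product of illuminant and reflectance;
    each color coordinate is the discrete sum of that ray against one sensor curve.
    """
    radiance = [r * e for r, e in zip(reflectance, illuminant)]
    return [sum(s * v for s, v in zip(sensor, radiance)) for sensor in sensors]


def gamma_encode(rgb, gamma=2.2):
    """Clip each coordinate to [0, 1] and apply display gamma (assumed value 2.2)."""
    return [max(0.0, min(1.0, c)) ** (1.0 / gamma) for c in rgb]
```

Rendering the same reflectance samples with a tungsten spectrum and with a daylight spectrum gives the input and target images, respectively.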




(a) Input. (b) Target.

(c) Cone. (d) 2-3 Tensor. (e) Ours.

Figure 2.7: Color correction example. Diagonal mappings are used in three different color

spaces to attempt a mapping of the input image (a) to the target illumination whose truth

value is shown in (b).

We ran three different algorithms (Cone, 2-3 tensor, and Ours) on the SFU database
to derive three optimized color bases for diagonal color constancy. Instead of choosing one
particular algorithm for then computing diagonals, we wanted to characterize some notion
of "best achievable" performance under each basis. We simulated the "best" learned
diagonals with the following steps: transform the SFU-derived measurement tensor by $C^{-1}$
to obtain color coordinates in the candidate color basis; then for each material, compute
the diagonal matrix mapping the color coordinates under the tungsten light bulb to the
corresponding material color under daylight illumination (just component-wise ratios);
finally, average the diagonal matrices over all materials to get the overall diagonal mapping
between the two illuminants under the candidate color basis.
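The three steps above can be sketched as follows. This is a minimal illustration, assuming that the measurements are given as one 3-vector per material for each illuminant and that no transformed coordinate under the source illuminant is zero; the names `average_diagonal` and `mat_vec` are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def average_diagonal(colors_src, colors_dst, c_inv):
    """Average per-material diagonal mapping between two illuminants.

    colors_src, colors_dst: per-material colors under the source (e.g. tungsten)
    and destination (e.g. daylight) illuminants.
    c_inv: matrix taking colors into the candidate color basis.
    """
    sums = [0.0, 0.0, 0.0]
    for src, dst in zip(colors_src, colors_dst):
        a = mat_vec(c_inv, src)  # material color under the source illuminant
        b = mat_vec(c_inv, dst)  # same material under the destination illuminant
        for k in range(3):
            sums[k] += b[k] / a[k]  # component-wise ratio = diagonal entry
    return [s / len(colors_src) for s in sums]
```

The returned list holds the diagonal entries; applying it to an image amounts to scaling each coordinate in the candidate basis and then transforming back.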




Figure 2.7 shows the output images for the three different color bases. The
differences are subtle, but one can see that the cone basis yields leaves that are too green,
and the 2-3 tensor method yields a flower that has too much blue (not enough of the red and
yellow of the flower is present). Our method provides the best match to the ideal target
image.

2.5.5 Constrained Optimization

For miscellaneous reasons, we may also want to constrain our optimized solution in other
ways. For example, in Chapter 3 we seek a color basis in which the implied color gamut
(collection of colors that have only positive components in the color basis) encompasses
the entire set of human-visible colors. There, we also choose to measure perceptual error
using the CIEDE2000 difference equation instead of using the standard $\ell_2$ error in a
linear color space. These extra constraints make each step of the alternating process of
Section 2.3 a nonlinear least squares optimization. To handle the numerics, we used the
Levenberg-Marquardt nonlinear least squares algorithm as implemented by [38] along with a
wrapper function that enforces constraints.

The green dot in Figure 2.4 represents the approximation error for the
rank 3 tensor in which the color basis was constrained such that the implied color gamut
would include the entire set of human-visible colors. If we use the green dot in place of
the red dot for the rank 3 approximation, we still get a good approximation of the SFU
tensor. The constrained rank 3 tensor's approximation error is 10.7% of the rank 2 tensor's
approximation error. The rank 4 tensor's approximation error is 19.5% of the constrained
rank 3 tensor's approximation error. We do not plot green dots for other tensor ranks
because the gamut constraint is not so well-defined in those cases.

The color basis transform resulting from the constrained optimization is given by
the following (which maps XYZ color vectors to the new color space):

$$C^{-1} = \begin{pmatrix}
9.465229\times10^{-1} & 2.946927\times10^{-1} & -1.313419\times10^{-1} \\
-1.179179\times10^{-1} & 9.929960\times10^{-1} & 7.371554\times10^{-3} \\
9.230461\times10^{-2} & -4.645794\times10^{-2} & 9.946464\times10^{-1}
\end{pmatrix} \qquad (2.14)$$

See Figure 2.5 for the effective von Kries sensors, which are labeled as Alternating
Nonlinear Least Squares (ANLS).

In Chapter 3, we are particularly interested in the ability of the diagonal model to
describe the effect of illumination change in the constrained color basis (whose gamut
includes all the human-visible colors). Unlike the white patch normalization experiment,
which measured some notion of relational color constancy, we would like to directly measure
the goodness of diagonal color constancy itself. To do this, we simulate a "perfect" white
balancing algorithm that perfectly maps the color of a standard white material under one
illuminant to the color of the same standard material seen under a different illuminant.
In a von Kries compatible world, the same diagonal used to perform this mapping would
correctly map all material colors under the initial illuminant to their corresponding colors
under the second illuminant. We therefore apply the same mapping to each of the material
colors under the first illuminant to derive predicted colors for the second illuminant.
Deviations from the actual measured colors under the second illuminant then give a measure




Median and RMS Diagonal Color Constancy Error for Constrained Color Basis

Canonical  Error     Tung.  Tung.+   3500K  3500K+   4100K  4100K+   4700K  4700K+
Tung.      median      -     1.67    0.54    2.23    1.14    2.72    1.74    3.20
Tung.      RMS         -     2.28    0.84    2.98    1.54    3.63    2.31    4.34
Tung.+     median    1.46      -     1.07    0.54    0.60    1.02    0.31    1.54
Tung.+     RMS       2.32      -     1.69    0.80    0.97    1.40    0.55    2.11
3500K      median    0.51    1.13      -     1.68    0.57    2.16    1.19    2.67
3500K      RMS       0.85    1.60      -     2.24    0.79    2.92    1.59    3.65
3500K+     median    1.88    0.52    1.53      -     1.07    0.58    0.66    1.14
3500K+     RMS       2.97    0.80    2.32      -     1.67    0.77    1.09    1.55
4100K      median    0.96    0.60    0.52    1.11      -     1.60    0.61    2.10
4100K      RMS       1.53    0.88    0.82    1.53      -     2.18    0.84    2.89
4100K+     median    2.35    1.04    2.05    0.60    1.62      -     1.14    0.58
4100K+     RMS       3.58    1.44    3.01    0.79    2.39      -     1.71    0.81
4700K      median    1.41    0.29    1.05    0.63    0.58    1.10      -     1.59
4700K      RMS       2.17    0.49    1.55    0.96    0.82    1.51      -     2.17
4700K+     median    2.84    1.60    2.57    1.23    2.18    0.61    1.70      -
4700K+     RMS       4.24    2.18    3.77    1.58    3.16    0.80    2.43      -

Table 2.1: Results from the white balancing experiment. Top error is median error, bottom
error is RMS error. Errors are measured in CIEDE2000 units. Row illuminant is the one
that is chosen as canonical. The illuminants are taken from the SFU dataset: "Tung." is
a basic tungsten bulb (Sylvania 50MR16Q 12VDC); the different temperatures correspond
to Solux lamps of the marked temperatures; the "+" means that a Roscolux 3202 Full Blue
filter has been applied.

of the non-von Kries-ness of the world. This experiment also has the advantage of
allowing error to be measured via the CIEDE2000 difference equation, giving some perceptual
meaning to the resulting numbers. This also allows us to characterize performance in some
absolute sense.

More specifically, we use the following procedure to report results. We choose
some illuminant as a canonical illuminant. We also choose the one material in the SFU
database labeled as white for our standard white material. Then, for every other (test)
illuminant, we compute the mapping that perfectly maps the standard white under the
canonical illuminant to its color under the test illuminant and is diagonal in our
constrained color basis. We then apply this mapping to all the material colors under the
canonical illuminant to generate predictions for colors under the test illuminant. Errors
between predictions and actual measured values in the SFU database are measured using the
CIEDE2000 difference equation. This gives us, for each test illuminant, an error value
associated with every material. So for each test illuminant, we report the median CIEDE2000
error and the root-mean-squared CIEDE2000 error of the material colors. See Table 2.1 for
the results.
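The reporting procedure can be sketched as below. This is a minimal sketch under stated assumptions: all colors are taken to be already expressed in the constrained color basis, and the color difference is passed in as a function. The `euclidean` helper is only a stand-in for the example; the experiment itself measures error with the CIEDE2000 difference equation, which is not implemented here.

```python
import math
import statistics


def diagonal_from_white(white_canonical, white_test):
    """Diagonal that maps the standard white exactly (component-wise ratios)."""
    return [t / c for c, t in zip(white_canonical, white_test)]


def predict(colors_canonical, diagonal):
    """Apply the same diagonal to every material color under the canonical illuminant."""
    return [[d * x for d, x in zip(diagonal, color)] for color in colors_canonical]


def euclidean(a, b):
    """Stand-in color difference (assumption); the thesis uses CIEDE2000 instead."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def summarize(predicted, measured, color_difference=euclidean):
    """Return (median error, RMS error) over all materials for one test illuminant."""
    errors = [color_difference(p, m) for p, m in zip(predicted, measured)]
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return statistics.median(errors), rms
```

Running this once per (canonical, test) pair would fill one median entry and one RMS entry of the results table.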

2.6 Discussion

We have argued for a new data-driven choice of color basis for diagonal color constancy

computations. We show that with respect to some existing metrics, the new choice leads to

a better diagonal model.

While a linear change of color basis poses no problem to those concerned simply
with algorithmic modeling, those who seek relevance to human biological mechanisms
might object (on theoretical grounds) that sensor measurement acquisition may involve
nonlinearities that disrupt the brain's ability to linearly transform the color basis
downstream. Fortunately, experimental results based on single-cell responses and
psychophysical sensitivity suggest that any existing nonlinearities at this level are
negligible [41, 57].



Chapter 3

A New Color Space for Image Processing

In this chapter, we motivate the need for a new color space primarily by the desire to
perform illumination-invariant image processing, in which algorithm outputs are not so
sensitive to the illumination conditions of their input images. Simultaneously, we also
seek a color space in which perceptual distances can be computed easily. We show that
these desires relate very naturally to notions in perceptual science and, with one
additional assumption, fully lock down the form of the color space parameterization. We fit
the remaining parameters to experimental data and apply our new color space to some examples
to illustrate its utility.


3.1 Introduction

While existing color spaces address a range of needs, none of them simultaneously capture

two notable properties required by a large class of applications that includes segmentation

and Poisson image editing [48]. In this work, we present a new color space designed

specifically to address this deficiency.

We propose the following two color space desiderata for image processing:

(1) difference vectors between color pixels are unchanged by re-illumination;

(2) the $\ell_2$ norm of a difference vector matches the perceptual distance between the two
colors.

The first objective restricts our attention to three-dimensional color space
parameterizations in which color displacements, or gradients, can be computed simply as
component-wise subtractions. Furthermore, it expresses the desire for these color
displacements (the most common relational quantities between pixels in image processing) to
be invariant to changes in the spectrum of the scene illuminant. Illumination invariance is
useful for applications where processing is intended to operate on intrinsic scene
properties instead of intensities observed under one particular illuminant. For example,
Figure 3.3 (page 58) shows that when segmenting an image using usual color spaces, images
that differ only in the scene's illumination during capture can require nontrivial parameter
tweaking before the resulting segmentations are clean and consistent. And even then, the
segmentations are not necessarily reliable. Figure 3.4 (page 61) illustrates the extreme
sensitivity a Poisson image editing algorithm exhibits when the illuminant of the foreground
object does not match the background's illumination.

The second condition implies that the standard computational method of measuring
error and distance in color space should match the perceptual metric used by human
viewers.

These desiderata have direct correspondence to widely-studied perceptual notions
that possess some experimental support. Desideratum (1) corresponds to subtractive
mechanisms in color image processing and to human color constancy. Desideratum (2)
relates to the approximate flatness of perceptual space.

Subtractive mechanisms refer to the notion that humans perform spatial color
comparisons by employing independent processing per channel [33, 43] and that such
per-channel comparisons take a subtractive form. Physiological evidence for subtraction comes
from experiments such as those revealing lateral inhibition in the retina [18] and the
existence of double opponent cells in the visual cortex (where each type provides a spatially
opponent mechanism for comparing a select chromatic channel) [46].

Color constancy is described in Section 1.1.1 and analyzed in detail in Chapter 2.
Sometimes the term chromatic adaptation is used instead to emphasize the inability to
achieve perfect constancy [18].

The approximate flatness of perceptual space refers to the relative empirical

successas
