Ham Thesis
8/11/2019 Ham Thesis
1/192
Geometric Methods in Perceptual Image Processing
A dissertation presented
by
Hamilton Yu-Ik Chong
to
The School of Engineering and Applied Sciences
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the subject of
Computer Science
Harvard University
Cambridge, Massachusetts
May 2008
© 2008 - Hamilton Yu-Ik Chong
All rights reserved.
Abstract
[...] understanding. We single out suggestive contours and illumination valleys as particularly
interesting because although one is defined in terms of three-dimensional geometry and the
other in terms of image features, the two produce strikingly similar results (and effectively
convey a sense of shape). This suggests that the two types of curves capture similar pieces
of geometric information. To explore this connection, we develop some general techniques
for recasting questions about the image as questions about the surface.
Contents
Title Page
Abstract
Table of Contents
Citations to Previously Published Work
Acknowledgments
Dedication
1 Introduction
1.1 Perceptual Image Processing
1.1.1 Color Constancy
1.1.2 A Perception-based Color Space
1.1.3 Shape Perception and Line Drawings
1.2 Contributions
1.3 Overview of Thesis
2 A Basis for Color Constancy
2.1 Introduction and Previous Work
2.2 Theory
2.2.1 Measurement Constraints
2.3 Color Basis for Color Constancy
2.4 Relationship to Previous Characterizations
2.5 Results
2.5.1 Effective Rank of the World
2.5.2 Von Kries Sensors
2.5.3 White Patch Normalization
2.5.4 White Balancing
2.5.5 Constrained Optimization
2.6 Discussion
3 A New Color Space for Image Processing
3.1 Introduction
3.2 Previous work
3.3 A Perceptual Metric
3.3.1 Formalizing the Color Space Conditions
3.3.2 Form of the Metric
3.3.3 Parameter Estimation
3.4 Metric Results
3.5 Applications
3.5.1 Segmentation
3.5.2 Poisson Image Editing
3.6 Discussion
4 Differential Geometry
4.1 Manifolds, Tensors, and Calculus
4.1.1 Manifolds
4.1.2 Tensor Fields
4.1.3 Integrals
4.1.4 Derivatives
4.2 Intrinsic and Extrinsic Properties of Surfaces
4.2.1 First Fundamental Form
4.2.2 Second Fundamental Form
4.2.3 Tensors that live on the surface itself
4.2.4 Higher Order Derivatives
4.3 The Metric Equivalence Problem
5 Shapes from Curves
5.1 Introduction
5.2 Curve Definitions
5.2.1 Surface-only Curves
5.2.2 Environment-dependent Curves
5.3 Relations Between Image and Surface Curves
5.3.1 Image Plane and Orthographic Projection
5.3.2 Critical Points of Illumination
5.3.3 Saint-Venant and Suggestive Energies
5.3.4 Suggestive Contours and Shading
5.4 Discussion
6 Conclusions and Future Work
Bibliography
A Color Constancy Proofs
A.1 Conditions for Success
A.1.1 Proof of Proposition 1
A.1.2 Proof of Lemma 1
A.2 The Space of Solutions
B Color Space Proofs
B.1 Deriving the Functional Form
B.2 Recovering Weber's Law
C More Differential Geometry
C.1 Non-coordinate Bases
C.2 Structure Equations for Surfaces
C.3 Theorema Egregium
D Image and Surface Curves
D.1 Basic Facts
D.2 Screen Coordinate Vector Fields
D.3 Curve Relations
D.4 Principal Directions
D.5 Apparent Curvature
D.6 General Lighting
Citations to Previously Published Work
Large portions of Chapter 2 on color constancy have previously appeared in:
The von Kries Hypothesis and a Basis for Color Constancy.
H. Y. Chong, S. J. Gortler, T. Zickler.
In Proceedings of ICCV 2007.
The perceptual optimization of a basis for color constancy (Section 2.5.5) and the develop-
ment of a new color space as presented in Chapter 3 have previously appeared in:
A Perception-based Color Space for Illumination-invariant
Image Processing.
H. Y. Chong, S. J. Gortler, T. Zickler.
In Proceedings of SIGGRAPH 2008.
Acknowledgments
I would first like to thank my advisor Professor Steven Gortler for his support
over the years (during both undergraduate and graduate days). From the beginning, he
gave me great freedom in choosing topics to pursue and provided excellent guidance on
how to approach any chosen problem. I would also like to thank Professors Todd Zickler,
Fredo Durand, Roger Brockett, and Craig Gotsman, all of whom have served on at least
one of my oral exam committees (and read my various scribbles). They have served as
wonderful sounding boards for my (often outlandish) ideas and thoughts. Thanks also to
Professors Doug DeCarlo and Szymon Rusinkiewicz for very helpful discussions on curves
and surfaces. And thanks as well to Brian Guenter for his mentorship at Microsoft Research
and for giving me the opportunity to broaden my research exposure and interact with the
amazing full-timers, postdocs, and fellow interns there.
Graduate school is of course a sometimes trying experience, and without the mutual
support of friends (all going through similar trials and tribulations), the hurdles would
likely feel quite insurmountable. So special thanks to Michael Kellermann, Yi-Ting Huang,
Jimmy Lin, Daniel DeSousa, Eleanor Hubbard, Leo Nguyen, Michelle Gardner, James
Black, Mark Hempstead, and Camilo Libedinsky, who put up with my antics outside
of lab and joined me for sports and gaming. Many thanks as well to lab-mates Danil
Kirsanov, Guillermo Diez-Canas, Brenda Ng, Loizos Michael, Philip Hendrix, Ece Kamar,
Geetika Lakshmanan, Doug Nachand, Yuriy Vasilyev, Fabiano Romiero, Emmanuel
Turquin, Christopher Thorpe, Charles McBrearty, Zak Stone, Kevin Dale, Kalyan Sunkavalli,
Moritz Baecher, Miriah Meyer, Christian Ledergerber, and Forrester Cole, who made the
lab an inviting place. Thanks also to David Harvey, Wei Ho, and Ivan Petrakiev for their
math pointers (and conversations outside of math as well). Thanks also to my friends in
and around Saratoga.
And of course, profuse thanks go to my family for their constant encouragement
(despite their teasing me about the cold California winters). I certainly would not
have gotten far at all without their support. I am reminded of a Chinese adage (popularized
by Yao Ming I believe): "How do I thank my family? How does a blade of grass thank
the sun?" Well, I do not have an answer to that question, so I will have to leave it at just
"Thanks!"
Thanks again to everyone (and apologies to the many other deserving people I've
left unmentioned)!
Dedicated to my parents Fu-Chiung and Kathleen Chong,
to my brothers Sanders and Anthony Chong,
and to all teachers.
Chapter 1
Introduction
1.1 Perceptual Image Processing
Perception may roughly be defined as the mind's process of disentangling information from
the physical means by which it is conveyed. As such, the study of perception is inherently a
study of abstract representations. In distilling information into abstract quanta, the human
mind prepares such information for conscious processing and consumption. In visual
perception, the inputs are retinal images and the outputs are mental descriptions (e.g., shape
and material properties) of the subject being observed.
On a spectrum of scales at which to probe human perception, the traditional
endeavors of cognitive psychologists and visual system neuroscientists may coarsely be
described as sitting at the two extremes. The former examines perceptual issues on a qual-
itative and macroscopic level while the latter identifies the microscopic biological building
blocks that allow for physical realization. Computational vision glues these two ends of the
spectrum together by providing an algorithmic description of how the functional building
blocks may give rise to the qualitative behaviors observed and classified through cognitive
experiments. Such a process (i.e., the algorithmic transformation of information) can be
studied independent of any of its physical instantiations, and hence falls within the purview
of computer science.
The aim of this dissertation is to propose new algorithmic models for aspects of
perceptual visual processing. While these models should replicate important features of
human vision, we by no means expect our models to fully describe reality. Rather, they are
simply meant to provide first order quantitative predictions; such predictions can then
be used to either bolster theories or point out assumptions in need of greater refinement.
One of our hopes is that these models might inform future experimental designs and aid
(even if only as a distracting, but somewhat informative, detour) in science's ongoing and
successive approximations to truth.
Ultimately, however, our primary aim in this work is not to predict actual human
behavior. Our main interest is in the design of new algorithms that assist computers in
attacking various problems in graphics and vision. Motivation for this comes from the robust
manner in which humans cope with environmental variations and successfully complete
a variegated set of visual tasks. By constructing new perception-inspired algorithms,
we may be able to endow computers with comparable robustness in accomplishing similar
goals. Therefore, the utility of these models will be judged by their usefulness in algorith-
mic problem solving; their value (as presented here) stands independent of the goodness
of our assumptions on how humans actually process visual information. In this work, we
seek progress on three problems: achieving color constancy, designing an illumination-
invariant color space, and relating projected image data to three-dimensional shape under-
standing.
1.1.1 Color Constancy
Color constancy refers to the fact that the perceived color of an object in a scene tends to
remain constant even when the spectral properties of the illuminant (and thus the tristimulus
data recorded at the retina) are changed drastically.
Figure 1.1: Arrows indicate pixel intensity values that have been replicated as uniform
squares of color outside the image context for side-by-side comparison with pixel intensity
values from the other image.
Figure 1.1 illustrates a local version of the effect. The left half of the figure
shows a set of puppets lit by normal daylight. The right half of the figure shows the same
set of puppets, but after an extreme environmental manipulation has been applied. While
no human viewer would consider pairs of spatially corresponding pixels between left and
right halves to be sharing the same color, even under this extreme change, the qualitative
experience of the colors is mostly preserved. For example, if asked to coarsely label the
color of the center puppet's shirt on the right (marked by one of the arrows), one would
settle upon the descriptor "yellow." However, when the pixel RGB values are pulled out
and displayed against a white background, one instead labels the same RGB values as
"blue." The surprising fact is that we do not simply see the right half of Figure 1.1 as
consisting only of various shades of blue or purple. We still experience, for example,
sensations of yellow (which is widely considered in vision science the color most opposite
that of blue [46]).
Color constancy is even more powerful when the same environmental manipula-
tion is applied to our entire visual field (i.e., is applied globally rather than locally). In such
cases, we do not have the bright white of the paper (or the original image colors) to deduce
the overall blue-ness of the new environment. For a common, simulated example of such
a global manipulation, consider the eventual color shift that occurs after putting on a pair
of ski goggles. The act of putting on goggles can be thought of as simulating a change
of illuminant to one in which there is less power in the short wavelengths. At first,
putting on a pair of ski goggles with an orange filter makes the world appear orange.
However, over time, the orange-ness disappears and colors are largely perceived as if goggles
had not been worn. When the goggles are later taken off, the world immediately appears
exceedingly blue until color perception once again shifts into an adapted state.
This phenomenon results from some kind of global processing that is still not
completely understood. One working model for achieving constancy is that the human first
somehow estimates the illuminant. The effect of illumination is then undone by applying
a tone mapping operator. This yields colors that are perceived as being lit under some
canonical illuminant. This interpretation may be referred to as illumination discounting
[18]. Another working model is that the human processes retinal images by relating colors
of observed objects. In converting retinal signals into perceived responses, the brain only
uses relationships that are not affected by illumination changes. Such an interpretation may
be referred to as spatial processing [18, 43].
Whatever the mechanism, this perceptual processing allows color to be treated as
an intrinsic material property. From our day-to-day perceptual experiences, we accept it as
sensible to refer to an apple as red, or an orange as orange. This, however, belies the fact
that color constancy is actually hard to achieve. Recall that the light seen reflected from
objects depends on the full spectrum of the illuminant and the per-wavelength attenuation
due to the material. Light is only turned into three sensory responses once it hits the eye.
Therefore, given scenes with arbitrary illuminant spectra and material reflectances, there is
no reason that constancy should be even close to possible. Consider the scenario in which
two different materials appear the same under one illuminant (i.e., they are metamers), but
look different under a second illuminant. In this case, one color maps to two distinct colors
with illuminant change, so color relations clearly do not remain constant.
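The metamer scenario can be made concrete with a toy discretization. The sketch below is only illustrative: the four-band spectra, the three sensor sensitivity curves, and the helper name `sensor_response` are all assumptions chosen so the arithmetic is exact, not measured data.

```python
def sensor_response(sensors, illuminant, reflectance):
    """Return one response per sensor: sum over bands of sensor * light * reflectance."""
    return [
        sum(s * e * r for s, e, r in zip(sensor, illuminant, reflectance))
        for sensor in sensors
    ]


# Hypothetical world with four wavelength bands and three sensors.
SENSORS = [
    [1.0, 0.0, 0.0, 0.0],  # responds to band 0 only
    [0.0, 1.0, 1.0, 0.0],  # cannot tell bands 1 and 2 apart
    [0.0, 0.0, 0.0, 1.0],  # responds to band 3 only
]
MATERIAL_A = [0.5, 0.25, 0.75, 0.5]
MATERIAL_B = [0.5, 0.75, 0.25, 0.5]
FLAT_LIGHT = [1.0, 1.0, 1.0, 1.0]
SKEWED_LIGHT = [1.0, 2.0, 0.5, 1.0]  # more power in band 1, less in band 2

a_flat = sensor_response(SENSORS, FLAT_LIGHT, MATERIAL_A)
b_flat = sensor_response(SENSORS, FLAT_LIGHT, MATERIAL_B)
a_skewed = sensor_response(SENSORS, SKEWED_LIGHT, MATERIAL_A)
b_skewed = sensor_response(SENSORS, SKEWED_LIGHT, MATERIAL_B)
```

Under the flat light the two materials produce identical responses, so nothing computed from the three responses alone can tell them apart. Under the skewed light the middle response differs, which is exactly the one-to-two mapping described above.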
Despite the manifest difficulty in achieving color constancy, models of color con-
stancy are often furthermore forced to obey an even more stringent constraint, the gener-
alized von Kries hypothesis [21]. This hypothesis stipulates that color correction is done
using independent scalings of the three tristimulus values. Von Kries' commonly adopted
model assumes that the effect of changing illumination is to multiply each of our
trichromatic sensor responses by a separate (possibly spatially varying) scale factor, and the brain
is thus able to discount the change by applying the inverse scale factor [18, 33, 46]. The
generalized von Kries model permits a change of color basis to take place before the gain
factors are applied.
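As a rough illustration of the two models, the sketch below applies per-channel gains directly (von Kries) and after a change of basis (generalized von Kries). The function names, the example basis, and the gain values are hypothetical and chosen only to keep the arithmetic easy to follow; how a good basis should be chosen is the subject of Chapter 2 and is not attempted here.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a length-3 vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def inverse_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("basis matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def von_kries(color, gains):
    """Scale each of the three channels by its own gain."""
    return [g * c for g, c in zip(gains, color)]


def generalized_von_kries(color, basis, gains):
    """Change basis, apply per-channel gains, then change back."""
    in_basis = mat_vec(basis, color)
    scaled = von_kries(in_basis, gains)
    return mat_vec(inverse_3x3(basis), scaled)


color = [2.0, 4.0, 6.0]
gains = [2.0, 0.5, 1.0]  # made-up gains for illustration
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shear = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

plain = von_kries(color, gains)
in_identity = generalized_von_kries(color, identity, gains)
in_shear = generalized_von_kries(color, shear, gains)
```

With the identity basis the generalized model reduces to the plain one. With any other invertible basis the same gains produce a different correction, which is why the choice of basis matters.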
With such demanding restrictions, color constancy may seem a vain endeavor.
Indeed, humans do not have perfect color constancy; however, our color percepts are
nonetheless surprisingly stable [46]. This suggests that the world's illuminants and reflectances
are not completely arbitrary. One natural question is then, "What are the necessary and
sufficient conditions under which constancy is in fact achievable?" Answers to this are
discussed in Chapter 2. There, we will focus mostly on the generalized von Kries model. A
related question is, "To what extent are these conditions true in our everyday world?" We
will present experimental tests of this as well. Note that these questions concern the
general possibility of achieving color constancy with any von Kries-like approach. Therefore,
answers to these questions provide blanket statements that upper bound what is attainable
for any method employing per-channel gain control for color constancy (e.g., grey world,
grey edge [54], gamut mapping [24], retinex [33]). They do not address what algorithm in
particular best achieves color constancy under these conditions.
Given that the necessary conditions for generalized von Kries models to achieve
perfect color constancy may not be met exactly, we would also like to compute an opti-
mal basis in which to run von Kries-based color constancy algorithms. The problem of
computing such a basis is addressed in Section 2.3. As a follow-up question we also pose
the following: "To what extent is there evidence that humans process under the basis we
compute?" This question is left unanswered in this thesis.
With regard to applications, models for color constancy can be applied to color
correction problems such as white balancing. In white balancing, a scene is captured under
one ambient illuminant. The image is then later displayed, but under viewing conditions
that employ a different ambient illuminant. The mismatch in illuminants during capture and
display, and hence mismatch in observer adaptations in both cases, causes discrepancies
in the perceived colors of the displayed reproduction. A white balanced image is one in
which the captured colors are mapped to colors as seen under the new display illuminant.
By displaying the white balanced image, a viewer perceives the reproduced scene in the
same way the one who captured the image perceived the original scene.
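One common way to realize such a mapping is per-channel gain control in the von Kries spirit. That choice is an assumption here, since the paragraph above does not commit to a method. The sketch below derives gains from how a reference white is recorded under the capture and display illuminants; the function names and sample values are hypothetical, and estimating the whites in the first place is a separate problem that is not shown.

```python
def gains_from_whites(white_under_capture, white_under_display):
    """Per-channel gains that carry the capture white onto the display white."""
    # Assumes every channel of the capture white is nonzero.
    return [d / c for c, d in zip(white_under_capture, white_under_display)]


def apply_gains(pixels, gains):
    """Scale every pixel (a list of three channel values) by the gains."""
    return [[g * c for g, c in zip(gains, pixel)] for pixel in pixels]


capture_white = [0.8, 1.0, 0.5]  # made-up recording of a white patch
display_white = [1.0, 1.0, 1.0]
image = [[0.8, 1.0, 0.5], [0.4, 0.5, 0.25]]

wb_gains = gains_from_whites(capture_white, display_white)
balanced = apply_gains(image, wb_gains)
```

The first pixel, which matched the capture white, lands on the display white, and the second pixel (half as bright) lands on mid-grey.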
1.1.2 A Perception-based Color Space
Given the importance of color processing in computer graphics, color spaces abound.
While existing color spaces address a range of needs, they all suffer from one notable
problem that makes them unsatisfactory for a large class of applications including segmen-
tation and Poisson image editing [48]: algorithms working in these color spaces exhibit
great sensitivity to the illumination conditions under which their inputs were captured.
Consider Figure 1.2 in which an active contour segmentation algorithm is run
on two images that differ only in the scene's illumination (these were generated using
full-spectral images and measured illuminant spectra). Figure 1.2a shows the two initial images.
Figure 1.2b shows their segmentations in a typical color space (the space of RGB values
provided by the image). Figure 1.2c shows their segmentations in another common color
space (CIELab). In both color spaces, the segmentation parameters were tuned on each
image to produce as clean and consistent segmentations as possible. Furthermore, even
after parameter tweaking, the segmentations do not appear particularly reliable. It would
be beneficial to have a segmentation algorithm that could use the same set of parameters to
produce consistent segmentations of the two images in spite of their differing illuminations.
(a) Pair of images to segment. (b) One typical color space. (c) Another color space.
Figure 1.2: Active contour segmentation of two images in two different color spaces with
parameters tweaked for each image so that segmentations are as clean and consistent as
possible. Optimal parameters for the image pairs are separated by 2 and 5 orders of
magnitude respectively.
Figure 1.3 shows another example, this time of Poisson image editing. Here, a
picture of a bear is cut out from one image (Figure 1.3a) and inserted into a background
image of a swimming pool (Figure 1.3b). Figure 1.3c shows the result when run in a usual
color space (the space of RGB values provided by the image). In this image, the foreground
object is seamlessly inserted into the background image; however, the bear appears rather
faded and ghostly. For comparison, Figure 1.3d shows an insertion performed using the
color space we present in Chapter 3. Here the bear appears much more corporeal.
(a) Object and mask. (b) Background. (c) Typical result. (d) More desirable result.
Figure 1.3: Poisson image editing.
These examples point to the fact that algorithms working in current color spaces
lack an essential robustness to the illumination conditions of their inputs. If processing
is meant to operate on intrinsic scene properties, then such illumination-dependent results
are unsatisfactory. Since color constancy deals with taking measured colors and turning
them into stable percepts, addressing this problem is plausibly related to issues of color
constancy. Given an algorithm for color constancy, one solution might be to map an image's
input colors to some canonical set of colors (i.e., colors as seen under a standard illuminant)
and then work in such a space instead. One potential problem with such an approach is that
it may depend on the ability of the algorithm to perform difficult tasks like estimating the
illuminant in order to map colors to their canonical counterparts. An even better scenario
would be if some ability to edit colors in an intrinsic sense were built into the choice of
color space itself, without requiring the adoption of a specific color correction algorithm.
1.1.3 Shape Perception and Line Drawings
Shape perception refers to the process of taking image data and inferring the three-dimensional
geometry of objects within the image. While there are many approaches to estimating shape
(e.g., shape from shading [29], shape from texture [5], shape from specular flow [2]), we
shall be interested in the connection between shape understanding and curves in an image.
(a) Silhouettes and suggestive contours. (b) Silhouettes only.
Figure 1.4: Line renderings of an elephant.
As shown in Figure 1.4a, line drawings alone are often enough to convey a sense of shape.
This suggests that curves themselves provide a lot of information about a surface. Artists
are able to make use of this fact and invert the process; they take a reference object, either
observed or imagined, and summarize it as a line drawing that is almost unambiguously
interpretable. Two natural questions then emerge: (1) Given a three-dimensional object,
what curves should one draw to best convey the shape? (2) Given an image, is there a
stable set of curves that humans can robustly detect and use to infer shape, and if so, how
are they defined and what precisely is their information content?
Analyses of the more classically defined curves on surfaces (e.g., silhouettes,
ridges, parabolic lines) are inadequate in giving a precise understanding to this problem.
For example, Figure 1.4b shows that the relationships between curves and shape perception
involve much more than simply silhouette data. The same can be said of the traditionally
defined ridges and valleys. Researchers have more recently defined curves such as sugges-
tive contours [15] and apparent ridges [32] which take advantage of view information and
produce informative renderings, but more precise analyses of when and why they succeed
are still lacking.
A complete understanding of the set of curves humans latch onto clearly ex-
tends far beyond the scope of this work; we instead focus on laying some groundwork for
future investigations. Our guiding philosophy is roughly as follows: humans only have
retinal images as inputs; assuming curve-based shape understanding is a learned process,
any meaningful curves should then be detectable (in a stable manner) in retinal images;
therefore, analysis should be developed for relating curves defined in images to geometric
properties of the viewed surface. Philosophy aside, the same geometric techniques devel-
oped for studying such relations may also be applied to answering other sorts of questions
relating curves on surfaces to curves in images, so we hope these techniques may prove
useful more generally.
1.2 Contributions
Color Constancy. Color constancy is almost exclusively modeled with von Kries (i.e.,
diagonal) transforms [21]. However, the choice of basis under which von Kries transforms
are taken is traditionally ad hoc. Attempts to remedy the situation have been hindered by
the fact that no joint characterization of the conditions for worlds¹ to support (generalized)
von Kries color constancy has previously been achieved. In Chapter 2, we establish the
following:
• Necessary and sufficient conditions for a world to support doubly linear color constancy.
• Necessary and sufficient conditions for a world to support generalized von Kries color constancy.
• An algorithm for computing a locally optimal color basis for generalized von Kries color constancy.
In the applications we discuss, we are mostly concerned with the cone sensors of the human
visual system. However, the same theory applies for camera sensors and can be generalized
in a straightforward manner for a different number of sensors.
¹ By a world, we mean a collection of illuminants, reflectances, and visual sensors. These are typically
described by a database of the spectral distributions for the illuminants, per-wavelength reflections for the
materials, and per-wavelength sensitivities for the sensors.
A Perception-based Color Space. In Chapter 3, we design a new color space with the
following properties:
• It has a simple 3D parameterization.
• Euclidean distances approximately match perceptual distances.
• Color displacements and gradients are approximately preserved under changes in the spectrum of the illuminant.
• Coordinate axes have an interpretation in terms of color opponent channels.
• It can easily be integrated with fibered space and measure-theoretic image models.
The first three bullet points imply the color space can easily be plugged into many image
processing algorithms (i.e., algorithms can be called as black-box functions and require no
change in implementation). This color space also enables these algorithms to work on the
intrinsic image and thereby yield (approximately) illumination-invariant results.
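To see why a color space can make displacements insensitive to the illuminant, consider a toy case. Assume, as in the von Kries model, that an illuminant change multiplies each channel by a fixed gain; taking a per-channel logarithm then turns the gains into additive offsets that cancel in any difference between two colors. The sketch below illustrates only this principle. It is not the parameterization derived in Chapter 3, and the function names and sample values are hypothetical.

```python
import math


def log_coordinates(color):
    """Map a strictly positive tristimulus triple to per-channel log coordinates."""
    return [math.log(c) for c in color]


def displacement(color_p, color_q):
    """Difference between two colors, measured in log coordinates."""
    p, q = log_coordinates(color_p), log_coordinates(color_q)
    return [a - b for a, b in zip(p, q)]


def relight(color, gains):
    """Simulate an illuminant change as independent per-channel scaling."""
    return [g * c for g, c in zip(gains, color)]


gains = [0.5, 1.0, 2.0]  # made-up illuminant change
shirt, wall = [0.6, 0.5, 0.1], [0.2, 0.3, 0.4]

before = displacement(shirt, wall)
after = displacement(relight(shirt, gains), relight(wall, gains))
max_change = max(abs(a - b) for a, b in zip(before, after))
```

Here `max_change` is zero up to floating-point rounding, whereas the raw channel differences between the two colors change with the gains. An algorithm that consumes only displacements or gradients in such coordinates therefore sees the same input under both illuminants.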
Shape Perception and Line Drawings. To study shape, we present some technical tools
in the context of relating Saint-Venant curves to suggestive contours. Contributions include
the following:
• Choice of bases for surface and image that simplifies calculations and makes expressions more conducive to interpretation.
• Transplanting of techniques for working with bases that change at every point and notation for reducing ambiguities.
• General procedure for expressing image information in terms of three-dimensional geometric quantities.
• Some relations between suggestive contours, Saint-Venant curves, and surface geometry.
• Relation between a surface curve's normal curvature in the tangent direction and its apparent curvature in the image.
1.3 Overview of Thesis
The organization for the rest of the thesis is as follows:
Chapter 2. We analyze the conditions under which various models for color constancy can
work. We focus in particular on the generalized von Kries model. We observe that the von
Kries compatibility conditions are impositions only on the sensor measurements, not the
physical spectra. This allows us to formulate the conditions succinctly as rank constraints
on an order three measurement tensor. Given this, we propose an algorithm that computes
a (locally) optimal choice of color basis for von Kries color constancy and compare the
results against other proposed choices.
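One way to write this down is sketched below. The notation ($E_i$, $R_j$, $\rho_k$, $M_{kij}$, $d_k$) is assumed here for illustration and may differ from the notation used in Chapter 2. The first line builds the measurement tensor from the spectra that make up a world; the second states plain von Kries compatibility in the sensor basis purely in terms of that tensor, which is why the physical spectra drop out of the conditions.

```latex
% Measurement of reflectance j under illuminant i by sensor k (k = 1, 2, 3)
M_{kij} = \int \rho_k(\lambda)\, E_i(\lambda)\, R_j(\lambda)\, d\lambda

% Von Kries compatibility: for every pair of illuminants (i, i') there are
% gains d_k^{(i \to i')}, independent of the reflectance j, such that
M_{k i' j} = d_k^{(i \to i')}\, M_{kij} \quad \text{for all } j, k
```

The generalized model asks for the same relation after a fixed invertible change of basis has been applied along the sensor index $k$.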
Chapter 3. Motivated by perceptual principles (in particular, color constancy), we derive a
new color space in which the associated metric approximates perceived distances and color
displacements capture relationships that are robust to spectral changes in illumination. The
resulting color space can be used with existing image processing algorithms with little
or no change to the methods. We show application to segmentation and Poisson image
editing.
Chapter 4. This chapter presents the mathematical background for Chapter 5 (where we
consider the relations between shapes and image curves). The previous chapters make only
minor references to this chapter, so readers only interested in the color processing portions
of this work can simply refer to this chapter as required. Chapter 2 makes use of multilinear
algebra, so reading the notation used for denoting tensors is sufficient. Chapter 3 makes
reference to the metric equivalence problem discussed in Section 4.3. The more involved
details are not so important, so a high level overview suffices. Chapter 5 makes full use
of the presented differential geometry. So readers interested in that chapter should read
Chapter 4 with more care.
Chapter 5. We use the formalism presented in Chapter 4 to relate curves detectable in
images to information about the geometry. We focus in particular on Saint-Venant valleys
and relate them to suggestive contours. We develop techniques for relating image and
surface information and apply these to characterizing curve behavior at critical points of
illumination. We also prove more limited results away from critical points of illumination.
Appendix A. This appendix provides proofs for the results cited in Chapter 2. It also
contains an analysis of the structure of the space of worlds supporting generalized von
Kries color constancy.
Appendix B. This appendix presents the proof that locks down the functional form of
our color space parameterization. We also prove that our model recovers the well-known
Weber's Law for brightness perception.
Appendix C. This appendix contains more detailed discussions on differential geometry
that are somewhat tangential to the main exposition. It includes some calculations that can
be used to verify some of the claims or get a better sense for how the formalism can be
used.
Appendix D. This appendix provides calculations and proofs for the various relations we
discuss in Chapter 5. It also details some further derivations that may be useful for future
work.
Chapter 2
A Basis for Color Constancy
In this chapter we investigate models for achieving color constancy. We devote particular
attention to the ubiquitous generalized von Kries model. In Section 2.2.1, we relate the
ability to attain perfect color constancy under such a model to the rank of a particular order
three tensor. For cases in which perfect color constancy is not possible, this relationship
suggests a strategy for computing an optimal color basis in which to run von Kries based
algorithms.
2.1 Introduction and Previous Work
For a given scene, the human visual system, post adaptation, will settle on the same per-
ceived color for an object despite spectral changes in illumination. Such an ability to dis-
cern illumination-invariant material descriptors has clear evolutionary advantages and also
largely simplifies (and hence is widely assumed in) a variety of computer vision algorithms.
To achieve color constancy, one must discount the effect of spectral changes in
the illumination through transformations of an observer's trichromatic sensor response
values. While many illumination-induced transformations are possible, it is commonly
assumed that each of the three sensors reacts with a form of independent gain control (i.e.,
each sensor response value is simply scaled by a multiplicative factor), where the gain
factors depend only on the illumination change [21, 33]. This is termed von Kries adaptation.
Represented in linear algebra, it is equivalent to multiplying each column vector of sensor
response values by a shared diagonal matrix (assuming spatially uniform illumination), and
is therefore also referred to as the diagonal model for color constancy.
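As a minimal sketch of the diagonal model, the gains can be estimated from one reference color seen under both illuminants and then applied as a shared diagonal matrix to every color vector. The helper name `von_kries_gains` and all numeric values below are illustrative assumptions, not data from this work.

```python
import numpy as np

def von_kries_gains(ref_under_e1, ref_under_e2):
    """Per-channel gains mapping colors under illuminant 1 to illuminant 2.

    Hypothetical helper: estimates the gains from a single reference
    color (e.g., a white patch) observed under both illuminants.
    """
    ref_under_e1 = np.asarray(ref_under_e1, dtype=float)
    ref_under_e2 = np.asarray(ref_under_e2, dtype=float)
    return ref_under_e2 / ref_under_e1

# Illustrative sensor responses; each column is a color vector under illuminant 1.
colors_e1 = np.array([[0.8, 0.2, 0.5],
                      [0.6, 0.4, 0.5],
                      [0.3, 0.7, 0.5]])
white_e1 = np.array([1.0, 1.0, 0.8])
white_e2 = np.array([0.9, 1.0, 1.2])

gains = von_kries_gains(white_e1, white_e2)
D = np.diag(gains)            # the shared diagonal matrix
colors_e2 = D @ colors_e1     # every column is scaled by the same gains
```

Because the matrix is diagonal, each sensor channel is adjusted independently of the other two, which is exactly the independent gain control assumption.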
Note that while the initial von Kries hypothesis applied only to direct multiplica-
tive adjustments of retinal cone sensors, we follow [21] and use the term more loosely to
allow for general trichromatic sensors. We also allow for a change of color basis to occur
before the per-channel multiplicative adjustment. (Finlayson et al. [21] refer to this as a
generalized diagonal model for color constancy, and they term the change of color basis a
sharpening transform.)
The (generalized) diagonal model is at the core of the majority of color constancy
algorithms. Even a number of algorithms not obviously reliant on the diagonal assumption
in fact rely on diagonal models following a change of color basis [21, 22]; their choice of
color basis is simply not explicit. Yet, despite the widespread use of the diagonal model,
good choices of color bases under which diagonal transforms can be taken are only partially
understood.
While these limitations have been well-documented, a more complete character-
ization of the conditions for von Kries compatibility has yet to be established. As a result,
the development of more powerful systems for choosing optimized color bases has been
slow. This chapter addresses these issues by answering the following questions:
(1) What are the necessary and sufficient conditions that sensors, illuminants, and mate-
rials must satisfy to be exactly von Kries compatible, and what is the structure of the
solution space?
(2) Given measured spectra or labeled color observations, how do we determine the color
space that best supports diagonal color constancy?
We observe that the joint conditions are impositions only on the sensor measure-
ments, not the physical spectra. This allows the von Kries compatibility conditions to be
succinctly formulated as rank constraints on an order 3 measurement tensor. Our analysis
leads directly to an algorithm that, given labeled color data, computes a locally optimal
choice of color basis in which to carry out diagonal color constancy computations. The
proposed framework also unifies most existing analyses of von Kries compatibility.
2.2 Theory
We define two notions of color constancy. The first definition captures the idea that a
single adjustment to the sensors will map all material colors seen under an illuminant $E_1$ to
reference colors under (a possibly chosen standard) illuminant $E_2$. The second definition
(also known as relational color constancy) captures the idea that surface colors have a fixed
relationship between each other no matter what overall illumination lights the scene. As
stated, these two definitions are not interchangeable. One being true does not imply the
other.
To define the issues formally, we need a bit of notation. Let $\mathcal{R}$ be the smallest
(closed) linear subspace of $L^2$ functions enclosing the spectral space of materials of interest.
Let $\mathcal{E}$ be the smallest (closed) linear subspace of $L^2$ functions enclosing the spectral
space of illuminants of interest. Let $p^{R,E}$ be the color (in the sensor basis) of material
reflectance $R(\lambda) \in \mathcal{R}$ under illumination $E(\lambda) \in \mathcal{E}$. In the
following, $D$ and $\bar{D}$ are operators that take color vectors and map them to color vectors.
$D$ is required to be independent of the material $R$; likewise, $\bar{D}$ is required to be
independent of the illuminant $E$. The $\star$ denotes the action of these operators on color vectors.

(1) Color constancy:
For all $E_1, E_2 \in \mathcal{E}$, there exists a $D(E_1,E_2)$ such that for all $R \in \mathcal{R}$,
$$p^{R,E_2} = D(E_1,E_2) \star p^{R,E_1}$$

(2) Relational color constancy:
For all $R_1, R_2 \in \mathcal{R}$, there exists a $\bar{D}(R_1,R_2)$ such that for all $E \in \mathcal{E}$,
$$p^{R_2,E} = \bar{D}(R_1,R_2) \star p^{R_1,E}$$

In the case that $D$ and $\bar{D}$ are linear (and hence identified with matrices), $\star$ is just
matrix-vector multiplication. If $D$ is linear, we say that the world supports linear adaptive color
constancy. If $\bar{D}$ is linear, we say the world supports linear relational color constancy. $D$
being linear does not imply $\bar{D}$ is linear, and vice versa. If both $D$ and $\bar{D}$ are linear,
we say the world supports doubly linear color constancy.
Figure 2.1: The 3xIxJ measurement tensor. The tensor can be sliced in three ways to
produce the matrices $\Omega_{(j)}$, $\Lambda_{(i)}$, and $\Gamma_{(k)}$.
In particular, we shall be interested in the case when $D$ and $\bar{D}$ are both furthermore
diagonal (under some choice of color basis). It is proven in [21] that for a fixed color
space, $D$ is diagonal if and only if $\bar{D}$ is diagonal. So the two notions of color constancy
are equivalent if either $D$ or $\bar{D}$ is diagonal, and we say the world supports diagonal color
constancy (the doubly modifier is unnecessary). The equivalence is nice because we, as
biological organisms, can likely learn to achieve definition 1, but seek to achieve definition
2 for inference.
Given a set of illuminants $\{E_i\}_{i=1,\dots,I}$, reflectances $\{R_j\}_{j=1,\dots,J}$, and sensor color
matching functions $\{\rho_k\}_{k=1,2,3}$, we define a measurement data tensor² (see Figure 2.1):

$$M^k_{ij} := \int \rho_k(\lambda)\, E_i(\lambda)\, R_j(\lambda)\, d\lambda \qquad (2.1)$$

For fixed values of $j$, we get 3xI matrices $\Omega_{(j)} := M^k_{ij}$ that map illuminants
expressed in the $\{E_i\}_{i=1,\dots,I}$ basis to color vectors expressed in the sensor basis. Likewise,
for fixed values of $i$, we get 3xJ matrices $\Lambda_{(i)} := M^k_{ij}$ that map surface reflectance spectra
expressed in the $\{R_j\}_{j=1,\dots,J}$ basis to color vectors. We can also slice the tensor by constant
$k$ to get IxJ matrices $\Gamma_{(k)} := M^k_{ij}$.

² We will use Latin indices starting alphabetically with i to denote tensor components instead of the usual
Greek letters to simplify notation and avoid confusion with other Greek letters floating around.

Figure 2.2: Core tensor form: a 3x3x3 core tensor is padded with zeros. The core tensor is
not unique.
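Equation (2.1) can be approximated numerically once all spectra are sampled on a common wavelength grid. The following is a minimal sketch, assuming uniformly spaced samples and a simple Riemann sum; the function name `measurement_tensor`, the random placeholder spectra, and the array sizes are illustrative assumptions.

```python
import numpy as np

def measurement_tensor(sensors, illuminants, reflectances, d_lambda):
    """Approximate Equation (2.1) with a Riemann sum.

    sensors:      (3, L) array, sensor responses sampled at L wavelengths
    illuminants:  (I, L) array
    reflectances: (J, L) array
    Returns M with shape (3, I, J), indexed as M[k, i, j].
    """
    return np.einsum("kl,il,jl->kij", sensors, illuminants, reflectances) * d_lambda

rng = np.random.default_rng(0)
L, I, J = 31, 4, 5                      # assumed sizes, for illustration only
sensors = rng.random((3, L))
illuminants = rng.random((I, L))
reflectances = rng.random((J, L))
M = measurement_tensor(sensors, illuminants, reflectances, d_lambda=10.0)

slice_fixed_j = M[:, :, 0]   # 3 x I matrix for material j = 0
slice_fixed_i = M[:, 0, :]   # 3 x J matrix for illuminant i = 0
slice_fixed_k = M[0, :, :]   # I x J matrix for sensor k = 0
```

The three slices correspond to the three ways of cutting the tensor described above: by material, by illuminant, and by sensor channel.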
Since color perception can depend only on the eye's trichromatic color measurements,
worlds (i.e., sets of illuminant and material spectra) giving rise to the same measurement
tensor are perceptually equivalent. To understand diagonal color constancy, therefore,
it is sufficient to analyze the space of measurement tensors and the constraints that these
tensors must satisfy. This analysis of von Kries compatible measurement tensors is covered
in section 2.2.1.
Given a von Kries compatible measurement tensor (e.g., an output from the al-
gorithm in section 2.3), one may also be interested in the constraints such a tensor places
on the possible spectral worlds. This analysis is covered in section 2.4.
2.2.1 Measurement Constraints
The discussion in this section will always assume generic configurations (e.g., color mea-
surements span three dimensions, color bases are invertible). Proofs not essential to the
main exposition are relegated to Appendix A.
Proposition 1. A measurement tensor supports doubly linear color constancy if and only
if there exists a change of basis for illuminants and materials that reduces it to the core
tensor form of Figure 2.2.
More specifically (as is apparent from the proof of Proposition 1 in Appendix A.1.1), if
a single change of illuminant basis makes all the $\Omega_{(j)}$ slices null past the third column,
the measurement tensor supports linear relational color constancy. Likewise, a change of
material basis making all the $\Lambda_{(i)}$ slices null past the third column implies the measurement
tensor supports linear adaptive color constancy. Support for one form of linear constancy
does not imply support for the other.
The following lemma provides a stepping stone to our main theoretical result and
is related to some existing von Kries compatibility results (see section 2.4).
Lemma 1. A measurement tensor supports generalized diagonal color constancy if and
only if there exists a change of color basis such that, for all $k$, $\Gamma_{(k)}$ is a rank-1 matrix.
This leads to our main theorem characterizing the space of measurement tensors
supporting generalized diagonal color constancy.
Theorem 1. A measurement tensor supports generalized diagonal color constancy if and
only if it is a rank 3 tensor.³

An order 3 tensor (3D data block) $T$ is rank $N$ if $N$ is the smallest integer such
that there exist vectors⁴ $\{a_n, b_n, c_n\}_{n=1,\dots,N}$ allowing decomposition as the sum of outer
products (denoted by $\circ$):

$$T = \sum_{n=1}^{N} c_n \circ a_n \circ b_n. \qquad (2.2)$$

³ There exist measurement tensors supporting generalized diagonal color constancy with rank less than 3,
but such examples are not generic.
⁴ In the language of differential geometry introduced in Chapter 4, $\{a_n, b_n\}_{n=1,\dots,N}$ are really covector
component lists. $\{c_n\}_{n=1,\dots,N}$ are really vector component lists.
Without loss of generality, let $\{a_n\}$ be vectors of length $I$, corresponding to the
illuminant axis of the measurement tensor; let $\{b_n\}$ be vectors of length $J$, corresponding
to the material axis of the tensor; and let $\{c_n\}$ be vectors of length 3, corresponding to the
color sensor axis of the tensor. Let the vectors $\{a_n\}$ make up the columns of the matrix $A$,
vectors $\{b_n\}$ make up the columns of the matrix $B$, and vectors $\{c_n\}$ make up the columns
of the matrix $C$. Then the decomposition above may be restated as a decomposition into
the matrices $(A, B, C)$, each with $N$ columns.
Proof. (Theorem 1). First suppose the measurement tensor supports generalized diagonal
color constancy. Then by Lemma 1, there exists a color basis under which each $\Gamma_{(k)}$ is
rank-1 (as a matrix). This means each $\Gamma_{(k)}$ can be written as an outer product,
$\Gamma_{(k)} = a_k \circ b_k$. In this color basis then, the measurement tensor is a rank 3 tensor in
which the matrix $C$ (following notation above) is just the identity.⁵ We also point out that an
invertible change of basis (on any of $A$, $B$, $C$) does not affect the rank of a tensor, so the
original tensor (before the initial color basis change) was also rank 3.

For the converse case, we now suppose the measurement tensor is rank 3. Since
$C$ is (in the generic setting) invertible, multi-linearity gives us

$$C^{-1} \star \left( \sum_{n=1}^{3} c_n \circ a_n \circ b_n \right) = \sum_{n=1}^{3} \left( C^{-1} c_n \right) \circ a_n \circ b_n. \qquad (2.3)$$

The $\star$ operator on the left hand side of Equation (2.3) denotes the application of the $3 \times 3$
matrix $C^{-1}$ along the sensor axis of the tensor. The right hand side of Equation (2.3) is
a rank 3 tensor with each $\Gamma_{(k)}$ slice a rank-1 matrix. By Lemma 1, the tensor must then
support diagonal color constancy.

⁵ Technically, this shows that the tensor is at most rank 3, but since we are working with generic tensors,
we can safely discard the degenerate cases in which observed colors do not span the three-dimensional color
space.
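The converse direction of the proof can be checked numerically on synthetic data. The sketch below is only an illustration under assumed random factors (the sizes, the seed, and the particular construction of an invertible color basis are all assumptions): it builds a rank 3 tensor, applies the inverse color basis along the sensor axis as in Equation (2.3), and confirms that each sensor slice becomes rank 1 and that illuminant changes act as per-channel gains.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, N = 6, 7, 3
A = 0.5 + rng.random((I, N))              # illuminant factors (columns a_n)
B = 0.5 + rng.random((J, N))              # material factors (columns b_n)
C = np.eye(3) + 0.3 * rng.random((3, N))  # color basis (columns c_n), diagonally dominant

# Rank 3 measurement tensor: M[k, i, j] = sum_n C[k, n] A[i, n] B[j, n]
M = np.einsum("kn,in,jn->kij", C, A, B)

# Apply C^{-1} along the sensor axis, as on the left of Equation (2.3).
M_new = np.einsum("mk,kij->mij", np.linalg.inv(C), M)

# Each sensor slice is now an outer product of one a_n and one b_n, hence rank 1.
slice_ranks = [int(np.linalg.matrix_rank(M_new[k], tol=1e-9)) for k in range(3)]

# Diagonal color constancy: colors under illuminant 1 are per-channel
# scalings of colors under illuminant 0, with gains independent of material.
gains = M_new[:, 1, :] / M_new[:, 0, :]   # shape (3, J); each row is constant
```

In the new basis the gain for channel `n` reduces to `A[1, n] / A[0, n]`, which does not involve the material factors at all.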
In the proof above, note that the columns of $C$ exactly represent the desired color basis
under which we get perfect diagonal color constancy. This theorem is of algorithmic
importance because it ties the von Kries compatibility criteria to quantities (best rank 3 tensor
approximations) that are computable via existing multilinear methods.
2.3 Color Basis for Color Constancy
Given a measurement tensor $M$ generated from real-world data, we would like to find the
optimal basis in which to perform diagonal color constancy computations. To do this,
we first find the closest von Kries compatible measurement tensor (with respect to the
Frobenius norm). We then return the color basis that yields perfect color constancy under
this approximate tensor.
By Theorem 1, finding the closest von Kries compatible measurement tensor is
equivalent to finding the best rank 3 approximation. Any rank 3 tensor may be written in
the form of equation (2.2) withN=3. It also turns out that such a decomposition of a rank
three tensor into these outer-product vectors is almost always unique (modulo permutations
and scalings). We solve for $M$'s best rank 3 approximation (decomposition into $A$, $B$, $C$)
via Trilinear Alternating Least Squares (TALS) [28]. For a rank 3 tensor, TALS forces $A$,
$B$, and $C$ to each have 3 columns. It then iteratively fixes two of the matrices and solves for
the third in a least squares sense.
Repeating these computations in lockstep guarantees convergence to a local
minimum. $A$, $B$, $C$ can be used to reconstruct the closest von Kries compatible tensor and the
columns of $C$ exactly represent the desired color basis.
As a side note, the output of this procedure differs from the best rank-(3,3,3)
approximation given by HOSVD [37]. HOSVD only gives orthogonal bases as output and
the rank-(3,3,3) truncation does not in general yield a closest rank 3 tensor. HOSVD may,
however, provide a good initial guess.
The following details on TALS mimic the discussion in [52]. For further
information, see [28, 52] and the references therein. The Khatri-Rao product of two matrices $A$
and $B$ with $N$ columns each is given by

$$A \odot B := \left[ a_1 \otimes b_1,\; a_2 \otimes b_2,\; \dots,\; a_N \otimes b_N \right], \qquad (2.4)$$

where $\otimes$ is the Kronecker product.
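A small sketch of the Khatri-Rao product as the column-wise Kronecker product. The function name `khatri_rao` and the example matrices are illustrative assumptions.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x N) and B (J x N), giving (I*J x N)."""
    if A.shape[1] != B.shape[1]:
        raise ValueError("A and B must have the same number of columns")
    # Row index of the result runs over (i, j) with i as the outer loop.
    return np.einsum("in,jn->ijn", A, B).reshape(-1, A.shape[1])

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])
AB = khatri_rao(A, B)   # shape (6, 2); column n equals np.kron(A[:, n], B[:, n])
```

Unlike the full Kronecker product of the two matrices, which would have `N * N` columns, the Khatri-Rao product pairs up only matching columns.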
Denote the flattening of the measurement tensor $M$ by $M_{IJ \times 3}$ if the elements of $M$
are unrolled such that the rows of matrix $M_{IJ \times 3}$ loop over the $(i,j)$-indices with $i = 1,\dots,I$
as the outer loop and $j = 1,\dots,J$ as the inner loop. The column index of $M_{IJ \times 3}$ corresponds
with the dimension of the measurement tensor that is not unrolled (in this case $k = 1,2,3$).
The notation for other flattenings is defined symmetrically. We can then write

$$M_{JI \times 3} = (B \odot A)\, C^{T}. \qquad (2.5)$$

By symmetry of equation (2.5), we can write out the least squares solutions for each of the
matrices (with the other two fixed):

$$A = \left[ (B \odot C)^{\dagger}\, M_{J3 \times I} \right]^{T}, \qquad (2.6)$$
$$B = \left[ (C \odot A)^{\dagger}\, M_{3I \times J} \right]^{T}, \qquad (2.7)$$
$$C = \left[ (B \odot A)^{\dagger}\, M_{JI \times 3} \right]^{T}, \qquad (2.8)$$

where $\dagger$ denotes the Moore-Penrose pseudo-inverse.
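The three updates can be sketched as a short alternating loop. This is a minimal illustration rather than the implementation used for the experiments: the random initialization, the fixed iteration count, the absence of a convergence test, and the function names `khatri_rao` and `tals` are all assumptions. The tensor is stored as `M[k, i, j]`.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product; rows loop over X's index (outer), then Y's (inner)."""
    return np.einsum("in,jn->ijn", X, Y).reshape(-1, X.shape[1])

def tals(M, rank=3, n_iter=500, seed=0):
    """Alternating least squares for M[k, i, j] ~ sum_n C[k, n] A[i, n] B[j, n].

    A sketch only: random initialization, fixed iteration count, and no
    convergence test or column normalization.
    """
    K, I, J = M.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Flattenings: rows loop over the two unrolled indices, first one outermost.
    M_J3_I = M.transpose(2, 0, 1).reshape(J * K, I)   # rows (j, k), columns i
    M_3I_J = M.reshape(K * I, J)                      # rows (k, i), columns j
    M_JI_3 = M.transpose(2, 1, 0).reshape(J * I, K)   # rows (j, i), columns k
    for _ in range(n_iter):
        A = (np.linalg.pinv(khatri_rao(B, C)) @ M_J3_I).T   # Equation (2.6)
        B = (np.linalg.pinv(khatri_rao(C, A)) @ M_3I_J).T   # Equation (2.7)
        C = (np.linalg.pinv(khatri_rao(B, A)) @ M_JI_3).T   # Equation (2.8)
    return A, B, C

# Synthetic rank 3 tensor built from known factors (placeholder data).
rng = np.random.default_rng(42)
A0 = rng.standard_normal((8, 3))
B0 = rng.standard_normal((9, 3))
C0 = rng.standard_normal((3, 3))
M0 = np.einsum("kn,in,jn->kij", C0, A0, B0)

A, B, C = tals(M0)
M_hat = np.einsum("kn,in,jn->kij", C, A, B)
rel_err = np.linalg.norm(M_hat - M0) / np.linalg.norm(M0)
```

Since each update solves a linear least squares problem exactly, the residual cannot increase from one step to the next, but the loop may still settle in a local minimum, so in practice one would likely try several initializations.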
2.4 Relationship to Previous Characterizations
As mentioned in the introduction, there are two main sets of theoretical results. There are
the works of [20,58] that give necessary and sufficient conditions for von Kries compatibil-
ity under a predetermined choice of color space, and are able to build infinite dimensional
von Kries compatible worlds for this choice. Then there are the works of [21, 22] that
prescribe a method for choosing the color space, but only for worlds with low dimen-
sional linear spaces of illuminants and materials. We omit direct comparison to the various
spectral sharpening techniques [6, 16, 23] in this section, as these methods propose more
intuitive guidelines rather than formal relationships.
Previous analyses treat the von Kries compatibility conditions as constraints on
spectra, whereas the analysis here treats them as constraints on color measurements. In this
section, we translate between the two perspectives. To go from spectra to measurement
tensors is straightforward. To go the other way is a bit more tricky. In particular, given a
measurement tensor with rank-1 $\Gamma_{(k)}$, there is not a unique world generating this data. Any
set of illuminants $\{E_i\}_{i=1,\dots,I}$ and reflectances $\{R_j\}_{j=1,\dots,J}$ satisfying Equation (2.1) (with
$M$ and $\rho_k$ fixed) will be consistent with the data. Many constructions of worlds are thus
possible. But if one first selects particular illuminant or material spectra as mandatory in-
clusions in the world, then one can state more specific conditions on the remaining spectral
choices. For more on this, see Appendix A.2.
In [21, 22], it is shown that if the illuminant space is 3 dimensional and the ma-
terial space is 2 dimensional (or vice versa), then the resulting world is (generalized) von
Kries compatible. As a measurement tensor, this translates into stating that any 3x3x2 mea-
surement tensor is (complex) rank 3. However this 3-2 condition is clearly not necessary
as almost every rank 3 tensor is not reducible via change of bases to size 3x3x2. In fact, one
can always extend a 3x3x2 tensor to a 3x3x3 core tensor such that the $\Gamma_{(k)}$ are still rank-1.
The illuminant added by this extension is neither black with respect to the materials, nor in
the linear span of the first two illuminants.
The necessary and sufficient conditions provided in [58] can be seen as special
cases of Lemma 1. The focus on spectra leads to a case-by-case analysis with arbitrary
spectral preferences. However, the essential property these conditions point to is that the
2x2 minors of $\Gamma_{(k)}$ must be zero (i.e., $\Gamma_{(k)}$ must be rank-1).
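The equivalence between vanishing 2x2 minors and rank 1 is easy to check on a small example. The helper name `max_abs_2x2_minor` and the matrices below are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def max_abs_2x2_minor(G):
    """Largest absolute value among all 2x2 minors of the matrix G."""
    rows, cols = G.shape
    return max(
        abs(G[r1, c1] * G[r2, c2] - G[r1, c2] * G[r2, c1])
        for r1, r2 in combinations(range(rows), 2)
        for c1, c2 in combinations(range(cols), 2)
    )

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0, 7.0])
rank_one = np.outer(a, b)        # every 2x2 minor vanishes
perturbed = rank_one.copy()
perturbed[0, 0] += 1.0           # no longer rank 1, so some minor is nonzero
```

For a nonzero matrix, all 2x2 minors vanish exactly when every pair of rows is proportional, which is the rank-1 condition.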
One case from [58] is explained in detail in [20]. They fix a color space, a space
of material spectra, and a single reference illumination spectrum. They can then solve for
the unique space of illumination spectra that includes the reference illuminant and is von
Kries compatible (in the fixed color basis) with the given material space.

Figure 2.3: The rows of a single $\Lambda_{(1)}$ slice are placed into a new measurement tensor (rows
are laid horizontally above) with all other entries set to zero. The $*$ marks the nonzero
entries.
In our framework, this can be interpreted as follows. The given input gives rise to
a single $\Lambda_{(1)}$ measurement slice. The three rows of this slice can be pulled out and placed
in a new measurement tensor of the form shown in Figure 2.3. This measurement tensor is
then padded with an infinite number of zero $\Lambda_{(i)}$ matrices. The $\Gamma_{(k)}$ slices of this new tensor
are clearly rank-1 matrices, and thus this tensor is von Kries compatible in the given color
space. Moreover, any measurement tensor with rank-1 $\Gamma_{(k)}$ that includes the original $\Lambda_{(1)}$
slice in its span must have $\Lambda_{(i)}$ slices that are spanned by the $\Lambda_{(i)}$ slices in Figure 2.3. With
this fixed tensor and the fixed material spectra, one can then solve Equation (2.1) to obtain
the space of compatible illumination spectra. This space can be described by three
non-black illuminants and an infinite number of black illuminants (giving zero measurements
for the input material space). Since the original $\Lambda_{(1)}$ measurement slice is in the span of
the $\Lambda_{(i)}$ slices, the original reference illuminant must be in the solution space.
2.5 Results
We used the SFU color constancy dataset to create a measurement tensor to use in our ex-
periments. The SFU database provides 8 illuminants simulating daylight, and 1,995 mate-
rials including measured spectra of natural objects. Fluorescent spectra were removed from
the dataset in hopes of better modeling natural lighting conditions since, in color matching
experiments, fluorescent lamps cause unacceptable mismatches of colored materials that
are supposed to match under daylight [60].
Color matching functions were taken to be CIE 1931 2-deg XYZ with Judd 1951
and Vos 1978 modifications [60]. To resolve mismatches in spectral sampling, we inter-
polated data using linear reconstruction. Cone fundamentals were taken to be the Vos and
Walraven (1971) fundamentals [60]. Experiments were run with illuminant spectra nor-
malized with respect to the L2 norm.
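The resampling and normalization steps can be sketched as follows. This is a hedged illustration only: the wavelength grids, the synthetic spectrum, the Riemann-sum form of the norm, and the function names `resample_spectrum` and `normalize_l2` are assumptions rather than details taken from the datasets.

```python
import numpy as np

def resample_spectrum(wavelengths, values, target_wavelengths):
    """Piecewise-linear resampling of a spectrum onto a common wavelength grid."""
    return np.interp(target_wavelengths, wavelengths, values)

def normalize_l2(spectrum, d_lambda):
    """Scale a sampled spectrum to unit L2 norm (Riemann-sum approximation)."""
    norm = np.sqrt(np.sum(spectrum ** 2) * d_lambda)
    return spectrum / norm

# Illustrative data: a spectrum sampled every 10 nm, resampled to 5 nm.
coarse_grid = np.arange(380.0, 781.0, 10.0)
fine_grid = np.arange(380.0, 781.0, 5.0)
coarse_values = 1.0 + 0.01 * (coarse_grid - 380.0)
fine_values = resample_spectrum(coarse_grid, coarse_values, fine_grid)
unit_spectrum = normalize_l2(fine_values, d_lambda=5.0)
```

Normalizing the illuminants keeps any single bright light from dominating a least squares fit over the whole dataset.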
We followed the strategy of Section 2.3 to produce optimized color spaces for
von Kries color constancy.
2.5.1 Effective Rank of the World
To test the effective rank of our world (as measured by available datasets), we approxi-
mate the SFU measurement tensor with tensors of varying rank and see where the dropoff
in approximation error occurs. Figure 2.4 shows the results from the experiment. We mea-
sured error in CIEDE2000 units in an effort to match human perceptual error. CIEDE2000
is the latest CIE standard for computing perceptual distance and gives the best fit of any
method to the perceptual distance datasets used by the CIE [40]. As a rule of thumb, 1
CIEDE2000 unit corresponds to about 1 or 2 just noticeable differences.

Figure 2.4: The effective rank of the SFU dataset is about 3. The green dot marks the error
for the rank 3 tensor in which the color gamut was constrained to include the entire set of
human-visible colors (see Section 2.5.5). [Plot: RMS Error of SFU Tensor Approximation;
horizontal axis: approximating tensor's rank; vertical axis: color RMS error (CIEDE2000).]
The red curve in Figure 2.4 shows that the SFU measurement tensor is already
quite well approximated by a rank 3 tensor. The rank 2 tensor's approximation error is
68.3% of the rank 1 tensor's approximation error. The rank 3 tensor's approximation error
is 6.7% of the rank 2 tensor's approximation error. The rank 4 tensor's approximation error
is 31.3% of the rank 3 tensor's approximation error. We discuss the meaning of the green
dot later in Section 2.5.5.
2.5.2 Von Kries Sensors
The matrix mapping XYZ coordinates to the new color coordinates, as computed using the
TALS optimization on the SFU database, is given by the following (where the rows have
been normalized to unit length):

$$C^{-1} = \begin{bmatrix}
9.375111 \times 10^{-1} & 3.197979 \times 10^{-1} & 1.371214 \times 10^{-1} \\
4.783729 \times 10^{-1} & 8.779015 \times 10^{-1} & 2.117135 \times 10^{-2} \\
8.338662 \times 10^{-2} & 1.235156 \times 10^{-1} & 9.888329 \times 10^{-1}
\end{bmatrix}. \qquad (2.9)$$

We ran our algorithm on the Joensuu database as well [45]. The Joensuu database provides
22 daylight spectra and 219 natural material spectra (mostly flowers and leaves). The
resulting basis vectors (columns of $C$) were within a couple degrees (in XYZ space) of the
SFU optimized basis vectors. This seems to suggest some amount of stability in the result.

Figure 2.5: Color matching functions. Top row shows the standard CIE XYZ and Cone
matching functions. The bottom row shows the effective sensor matching functions
resulting from applying the $C$ matrix derived from optimizations on the SFU database. ANLS
(alternating nonlinear least squares) constrains the color space gamut to contain the
human-visible gamut and measures perceptual error using CIEDE2000. [Panels: CIE XYZ Response,
Cone Response, TALS Effective Response, ANLS Effective Response; horizontal axis:
wavelength (nm); vertical axis: normalized response.]
The effective sensor matching functions given by the optimized color basis are
shown in Figure 2.5. As is intuitively suspected, the optimized basis causes a sharpening
in the peaks of the matching functions. This expectation is motivated by the fact that the
diagonal model is exact for disjoint sensor responses. The ANLS result will be discussed
in Section 2.5.5.
2.5.3 White Patch Normalization
There are many experiments one might devise to measure the color constancy afforded by
various color bases. We chose to replicate a procedure commonly used in the literature.
This experiment is based on a white patch normalization algorithm, and is described
below.
Dataset and algorithms
We ran our color basis algorithm on the SFU dataset [7] and compared our resulting color
basis against previous choices (the cone sensor basis, 4 bases derived from different low
dimensional approximations of spectral data, that of Barnard et al. [6], and the sensor
sharpened basis [23]).
The low dimensional worlds to which we compare are taken to have either 3
dimensional illuminant spaces and 2 dimensional material spaces (a 3-2 world) or vice
versa (a 2-3 world); this allows computing color bases via the procedure in [21].
We took two different approaches to approximating spectra with low dimensional
vector spaces. In the first approach (described in [21, 23]), we run SVD on the illuminant
and material spectra separately. We then save the best rank-3 and rank-2 approximations.
This is Finlayson's perfect sharpening method for databases with multiple lights [23],
and is one of the algorithms that falls under the label of spectral sharpening.
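The first approach can be sketched with a truncated SVD. The sketch assumes the spectra are stacked as rows of a matrix and that no mean-centering is applied; the function name `low_rank_basis` and the random placeholder spectra are likewise assumptions, not the datasets used here.

```python
import numpy as np

def low_rank_basis(spectra, rank):
    """Best rank-`rank` approximation (Frobenius norm) of a spectra matrix.

    spectra: (num_spectra, num_wavelengths) array, one spectrum per row.
    Returns (basis, approximation); basis has shape (rank, num_wavelengths).
    """
    U, s, Vt = np.linalg.svd(spectra, full_matrices=False)
    basis = Vt[:rank]
    approximation = (U[:, :rank] * s[:rank]) @ basis
    return basis, approximation

rng = np.random.default_rng(3)
illuminant_spectra = rng.random((8, 31))     # placeholder data, not a real dataset
material_spectra = rng.random((50, 31))
illum_basis, illum_approx = low_rank_basis(illuminant_spectra, rank=3)
mat_basis, mat_approx = low_rank_basis(material_spectra, rank=2)
```

Swapping the two `rank` arguments would give the 2-3 world instead of the 3-2 world.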
As pointed out in [42], if error is to be measured in sensor space, there are
alternatives to running PCA on spectra. Given a measurement tensor, the alternative
(tensor-based) approach instead applies SVD on the tensor flattenings $M_{J3 \times I}$ and $M_{3I \times J}$ to get the
principal combination coefficients of the spectral bases (to be solved for) that approximate
the sample spectra. Refer to [42] for details.
We label the algorithm of Barnard et al. [6] as Barnard. This is a more recent
algorithm and also falls under the label of spectral sharpening.
We also ran experiments against Finlayson's sensor sharpening algorithm. Note
that this algorithm does not actually use the database in determining the sensor transforms.
Its heuristic is simply to transform the sensors so that the responses are as sharp as pos-
sible.
We also tested against a modified version of Finlayson's database sharpening
method [23]. This algorithm as stated is defined in the case when the database has two
illuminants and possibly many materials. Since our database has multiple lights, we used
PCA on the set of illuminant spectra and ran the algorithm using the two most dominant
principal components as our two illuminants. The results were nearly identical to both of
the 2-3 methods, and we therefore omit the corresponding curves from the graphs.
Experimental procedure
We run the same white-patch normalization experiment as in [21]. As input, we are given
a chosen white material $W$ and an illuminant $E$. For the SFU database, we used the only
material labeled as white as our white material $W$. For every other material $R$, we compute
a descriptor by dividing each of its 3 observed color coordinates by the 3 color coordinates
of $W$ (the resulting 3 ratios are then transformed as a color vector to XYZ coordinates so
that consistent comparisons can be made with different choices of color space). In a von
Kries world, the descriptor for $R$ would not depend on the illuminant $E$. To measure the
non von Kries-ness of a world, we can look at how much these descriptors vary with the
choice of $E$.
More formally, we define the descriptor as:

$$d^{W,R}_{E} = C \left[ \mathrm{diag}\!\left( C^{-1} p^{W,E} \right) \right]^{-1} C^{-1} p^{R,E} \qquad (2.10)$$

The function diag creates a matrix whose diagonal elements are the given vector's
components. $C$ is a color basis. $C$, $p^{R,E}$, $p^{W,E}$ are given in the CIE XYZ coordinate system.
This means we compute the color vectors $p^{W,E}$ and $p^{R,E}$ as:

$$(p^{W,E})_k := \int \rho_k(\lambda)\, E(\lambda)\, W(\lambda)\, d\lambda \qquad (2.11)$$
$$(p^{R,E})_k := \int \rho_k(\lambda)\, E(\lambda)\, R(\lambda)\, d\lambda \qquad (2.12)$$

with $\rho_1(\lambda) = \bar{x}(\lambda)$ = CIE response function for X, $\rho_2(\lambda) = \bar{y}(\lambda)$ = CIE response function
for Y, and $\rho_3(\lambda) = \bar{z}(\lambda)$ = CIE response function for Z. The columns of the matrix $C$ are
the (normalized) basis vectors of a new color space in XYZ coordinates. $C^{-1}$ is the inverse
of $C$. This means that given some color vector $v$ in XYZ coordinates, $C^{-1} v$ computes the
coordinates of the same color vector expressed in the new color space's coordinate system.
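The descriptor of Equation (2.10) amounts to a per-channel division in the new basis followed by a return to XYZ. The sketch below assumes a small hand-picked (unnormalized) basis matrix and a synthetic world that is exactly diagonal in that basis; the function name `descriptor` and all numbers are illustrative assumptions.

```python
import numpy as np

def descriptor(C, p_white, p_material):
    """White-patch descriptor of Equation (2.10), in XYZ coordinates.

    C:          3x3 matrix whose columns are the new basis vectors (in XYZ)
    p_white:    XYZ color of the white material under illuminant E
    p_material: XYZ color of the material R under the same illuminant
    """
    C_inv = np.linalg.inv(C)
    ratios = (C_inv @ p_material) / (C_inv @ p_white)   # per-channel division
    return C @ ratios

# Synthetic check: a world that is exactly diagonal in basis C.
C = np.array([[1.0, 0.2, 0.0],
              [0.1, 1.0, 0.1],
              [0.0, 0.3, 1.0]])
white_coords = np.array([1.0, 1.0, 1.0])       # coordinates in the new basis
material_coords = np.array([0.2, 0.5, 0.8])
gains_e1 = np.array([1.0, 1.0, 1.0])           # illuminant E1
gains_e2 = np.array([0.7, 1.1, 1.6])           # illuminant E2

d_e1 = descriptor(C, C @ (gains_e1 * white_coords), C @ (gains_e1 * material_coords))
d_e2 = descriptor(C, C @ (gains_e2 * white_coords), C @ (gains_e2 * material_coords))
```

In this synthetic world the illuminant gains cancel in the division, so the two descriptors coincide; with real spectra they would differ, and that difference is what the experiment measures.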
To compute a non von Kries-ness error, we fix a canonical illuminant $E'$ and
compute descriptors $d^{W,R}_{E'}$ for every test material $R$. We then choose some different illuminant
$E$ and again compute descriptors $d^{W,R}_{E}$ for every test material $R$. Errors for every choice of
$E$ and $R$ are computed as:

$$\mathrm{Error} = 100 \times \frac{\left\| d^{W,R}_{E'} - d^{W,R}_{E} \right\|}{\left\| d^{W,R}_{E'} \right\|} \qquad (2.13)$$

For each instance of the experiment, we choose one SFU test illuminant $E$ and
compute the errors over all test materials (canonical illuminant $E'$ is kept the same for
each experimental instance). Each time, the color basis derived from running our method
(labeled as Optimized) performed the best.
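A minimal sketch of the error measure of Equation (2.13), assuming the normalization is by the descriptor computed under the canonical illuminant. The function name `von_kries_error` and the sample descriptors are illustrative assumptions.

```python
import numpy as np

def von_kries_error(d_canonical, d_test):
    """Percent error of Equation (2.13) between two descriptors of one material."""
    d_canonical = np.asarray(d_canonical, dtype=float)
    d_test = np.asarray(d_test, dtype=float)
    return 100.0 * np.linalg.norm(d_canonical - d_test) / np.linalg.norm(d_canonical)

err_same = von_kries_error([0.3, 0.4, 0.0], [0.3, 0.4, 0.0])
err_diff = von_kries_error([0.3, 0.4, 0.0], [0.3, 0.4, 0.05])
```

Here the canonical descriptor has norm 0.5 and the two descriptors differ by 0.05 in one coordinate, so `err_diff` comes out to 10 percent.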
To visualize the results of the experimental test phase, we plot histograms of the percentage
of color vectors that are mapped correctly with a diagonal transform versus the percentage of
allowable error. Hence, each histogram plot requires the specification of two illuminants: one canonical
and one test. Figure 2.6 shows the cumulative histograms for instances in which the stated
basis performs the best and worst relative to the next best basis. Relative performance
between two bases is measured as a ratio of the areas under their respective histogram curves.
The entire process is then repeated for another canonical illuminant E' to give a total of 4 graphs.
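The cumulative histogram and the area-ratio comparison can be sketched as follows. This is an illustrative reading, not the original implementation: the function names are hypothetical, a vector is counted as "mapped correctly" when its error is at most the allowable threshold, and the area is approximated with the trapezoidal rule.

```python
def cumulative_histogram(errors, thresholds):
    """For each allowable error, the percent of vectors with error <= threshold."""
    n = len(errors)
    return [100.0 * sum(1 for e in errors if e <= t) / n for t in thresholds]


def area_under_curve(thresholds, percents):
    """Trapezoidal area under a cumulative histogram curve."""
    area = 0.0
    for k in range(1, len(thresholds)):
        width = thresholds[k] - thresholds[k - 1]
        area += 0.5 * width * (percents[k] + percents[k - 1])
    return area


def relative_performance(errors_a, errors_b, thresholds):
    """Ratio of areas under the curves of bases A and B (> 1 favors A)."""
    curve_a = cumulative_histogram(errors_a, thresholds)
    curve_b = cumulative_histogram(errors_b, thresholds)
    return area_under_curve(thresholds, curve_a) / area_under_curve(thresholds, curve_b)
```

A basis whose errors are concentrated near zero reaches 100% sooner, so its curve encloses more area over the same range of allowable error.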
Figure 2.6: Percent vectors satisfying von Kries mapping versus percent allowable error.
Each curve represents a different choice of color space. For low dimensional worlds, the
dimension of the illuminant space precedes the dimension of the material space in the
abbreviated notation. Low dimensional approximations were obtained either by running
PCA on spectra or by tensor methods described in text. We show the experimental instances
in which our derived basis performs the best and worst relative to the next best basis. The
left and right halves differ in choice of canonical illuminant for testing. Unlike Barnard,
our method effectively optimizes all pairs of lights.
White patch normalization results
The optimization algorithm labeled as Barnard requires specification of a canonical illu-
minant and chooses a color basis such that the mapping between any other test illuminant
and the canonical one is as diagonal as possible (in a least squares sense).
To be fair, we show two sets of results: one where the canonical illuminant during
test matches the canonical illuminant used in Barnard optimization; and one set where a
different canonical illuminant is chosen during test from that used in Barnard optimization.
The second canonical illuminant is chosen to illustrate the best worst-case relative perfor-
mance of our algorithm. Goodness is measured as the ratio of areas under the histogram
curves.
Barnard's algorithm performs close to ours for some pairings of test and canonical
illuminants, but is outperformed in most cases. These results are explained theoretically
by the fact that, even though it optimizes over multiple light-pairs, the method of Barnard
et al. always optimizes with respect to a single canonical illuminant. In contrast, we effec-
tively optimize over all possible pairs of lights.
As noted earlier, we also tested against Finlayson's database sharpening method
[23] (using PCA on lights to handle multiple lights). The results were nearly identical to
both of the 2-3 methods.
2.5.4 White Balancing
In this section, we discuss a rough perceptual validation we performed. We first down-
loaded a hyperspectral image from the database provided by [25]. Each pixel of the hyper-
spectral image gives a discretized set of samples for the per-wavelength reflectance function
of the material seen at that pixel. The particular image we used was that of a flower shown
in Figure 2.7. Assuming a lighting model in which viewed rays are simply computed as the
product of the illuminant and reflectance spectra in the wavelength domain, we rendered the
scene as it would appear under a tungsten light bulb and as it would appear under daylight.
For visualization we converted all images into RGB coordinates and gamma corrected each
coordinate before display. The goal of the algorithm was to start with the input image of
the flower under a tungsten light and transform it into the target image of the flower under
daylight using a generalized diagonal model. The goodness of a color basis is judged by the
degree to which it facilitates the transformation of the input into the desired output (with
the significance of discrepancies determined by human perceptual error).
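The lighting model used for rendering (a per-wavelength product of illuminant and reflectance spectra, integrated against the response functions) can be sketched for one pixel as below. The function name, the uniform sampling step, and the Riemann-sum approximation of the integral are assumptions for illustration; the response functions are passed in as sampled lists rather than taken from any particular data source.

```python
def render_pixel(response_functions, illuminant, reflectance, step):
    """Approximate p_k = integral of rho_k * E * R over wavelength for each k.

    response_functions: list of sampled response curves (one list per channel)
    illuminant, reflectance: spectra sampled at the same wavelengths
    step: wavelength spacing between samples (assumed uniform)
    """
    return [
        step * sum(r * e * s for r, e, s in zip(rho_k, illuminant, reflectance))
        for rho_k in response_functions
    ]
```

Calling this once with the tungsten spectrum and once with the daylight spectrum, for each pixel's reflectance samples, would give the input and target images before conversion to RGB for display.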
(a) Input. (b) Target.
(c) Cone. (d) 2-3 Tensor. (e) Ours.
Figure 2.7: Color correction example. Diagonal mappings are used in three different color
spaces to attempt a mapping of the input image (a) to the target illumination whose truth
value is shown in (b).
We ran three different algorithms (Cone, 2-3 tensor, and Ours) on the SFU database
to derive three optimized color bases for diagonal color constancy. Instead of choosing one
particular algorithm for then computing diagonals, we wanted to characterize some notion
of best achievable performance under each basis. We simulated the best learned diagonals
with the following steps: transform the SFU-derived measurement tensor by $C^{-1}$
to obtain color coordinates in the candidate color basis; then for each material, compute
the diagonal matrix mapping the color coordinates under the tungsten light bulb to the
corresponding material color under daylight illumination (just component-wise ratios); fi-
nally, average the diagonal matrices over all materials to get the overall diagonal mapping
between the two illuminants under the candidate color basis.
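The averaging steps above can be sketched as follows, assuming the material colors have already been transformed into the candidate color basis. The function names and the list-of-3-vectors data layout are hypothetical.

```python
def best_average_diagonal(coords_source, coords_target):
    """Average, over materials, of the per-material diagonal mapping.

    coords_source[m] and coords_target[m] are the 3-vectors of material m
    under the source and target illuminants, already expressed in the
    candidate color basis. Each per-material diagonal is the vector of
    component-wise ratios target / source.
    """
    n = len(coords_source)
    diagonal = [0.0, 0.0, 0.0]
    for src, tgt in zip(coords_source, coords_target):
        for k in range(3):
            diagonal[k] += (tgt[k] / src[k]) / n
    return diagonal


def apply_diagonal(diagonal, coords):
    """Apply the diagonal mapping to one color given in the candidate basis."""
    return [d * c for d, c in zip(diagonal, coords)]
```

Applying the averaged diagonal to each pixel of the input image (in the candidate basis) and transforming back by C would give the corrected image for that basis.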
Figure 2.7 shows the output images for the three different color bases. The dif-
ferences are subtle, but one can see that the cone basis yields leaves that are too green, and
the 2-3 tensor method yields a flower that has too much blue (not enough of the red and
yellow of the flower is present). Our method provides the best match to the ideal target
image.
2.5.5 Constrained Optimization
For miscellaneous reasons, we may also want to constrain our optimized solution in other
ways. For example, in Chapter 3 we seek a color basis in which the implied color gamut
(collection of colors that have only positive components in the color basis) encompasses
the entire set of human-visible colors. There, we also choose to measure perceptual error
using the CIEDE2000 difference equation instead of using the standard $\ell_2$ error in a linear
color space. These extra constraints make each step of the alternating process of Section
2.3 a nonlinear least squares optimization. To handle the numerics, we used the Levenberg-
Marquardt nonlinear least squares algorithm as implemented by [38] along with a wrapper
function that enforces constraints.
The green dot in Figure 2.4 represents the approximation error for the
rank 3 tensor in which the color basis was constrained such that the implied color gamut
would include the entire set of human-visible colors. If we use the green dot in place of
the red dot for the rank 3 approximation, we still get a good approximation of the SFU
tensor. The constrained rank 3 tensor's approximation error is 10.7% of the rank 2 tensor's
approximation error. The rank 4 tensor's approximation error is 19.5% of the constrained
rank 3 tensor's approximation error. We do not plot green dots for other tensor ranks
because the gamut constraint is not so well-defined in those cases.
The color basis transform resulting from the constrained optimization is given by
the following (which maps XYZ color vectors to the new color space):
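$$C^{-1} = \begin{pmatrix}
9.465229 \times 10^{-1} & 2.946927 \times 10^{-1} & -1.313419 \times 10^{-1} \\
-1.179179 \times 10^{-1} & 9.929960 \times 10^{-1} & 7.371554 \times 10^{-3} \\
9.230461 \times 10^{-2} & -4.645794 \times 10^{-2} & 9.946464 \times 10^{-1}
\end{pmatrix} \qquad (2.14)$$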
See Figure 2.5 for the effective von Kries sensors, which are labeled as Alternating Non-
linear Least Squares (ANLS).
In Chapter 3, we are particularly interested in the ability of the diagonal model to
describe the effect of illumination change in the constrained color basis (whose gamut
includes all the human-visible colors). Unlike the white patch normalization experiment, which
measured some notion of relational color constancy, we would like to directly measure the
goodness of diagonal color constancy itself. To do this, we simulate an ideal white
balancing algorithm that perfectly maps the color of a standard white material under one
illuminant to the color of the same standard material seen under a different illuminant.
In a von Kries compatible world, the same diagonal used to perform this mapping would
correctly map all material colors under the initial illuminant to their corresponding colors
under the second illuminant. We therefore apply the same mapping to each of the material
colors under the first illuminant to derive predicted colors for the second illuminant. De-
viations from the actual measured colors under the second illuminant then give a measure
Median and RMS Diagonal Color Constancy Error for Constrained Color Basis
                   Tung.  Tung.+  3500K  3500K+  4100K  4100K+  4700K  4700K+
Tung.    median    -      1.67    0.54   2.23    1.14   2.72    1.74   3.20
         RMS       -      2.28    0.84   2.98    1.54   3.63    2.31   4.34
Tung.+   median    1.46   -       1.07   0.54    0.60   1.02    0.31   1.54
         RMS       2.32   -       1.69   0.80    0.97   1.40    0.55   2.11
3500K    median    0.51   1.13    -      1.68    0.57   2.16    1.19   2.67
         RMS       0.85   1.60    -      2.24    0.79   2.92    1.59   3.65
3500K+   median    1.88   0.52    1.53   -       1.07   0.58    0.66   1.14
         RMS       2.97   0.80    2.32   -       1.67   0.77    1.09   1.55
4100K    median    0.96   0.60    0.52   1.11    -      1.60    0.61   2.10
         RMS       1.53   0.88    0.82   1.53    -      2.18    0.84   2.89
4100K+   median    2.35   1.04    2.05   0.60    1.62   -       1.14   0.58
         RMS       3.58   1.44    3.01   0.79    2.39   -       1.71   0.81
4700K    median    1.41   0.29    1.05   0.63    0.58   1.10    -      1.59
         RMS       2.17   0.49    1.55   0.96    0.82   1.51    -      2.17
4700K+   median    2.84   1.60    2.57   1.23    2.18   0.61    1.70   -
         RMS       4.24   2.18    3.77   1.58    3.16   0.80    2.43   -
Table 2.1: Results from the white balancing experiment. Top error is median error, bottom
error is RMS error. Errors are measured in CIEDE2000 units. Row illuminant is the one
that is chosen as canonical. The illuminants are taken from the SFU dataset: "Tung." is
a basic tungsten bulb (Sylvania 50MR16Q 12VDC); the different temperatures correspond
to Solux lamps of the marked temperatures; the "+" means that a Roscolux 3202 Full Blue
filter has been applied.
of the non-von Kries-ness of the world. This experiment also has the advantage of allow-
ing error to be measured via the CIEDE2000 difference equation, giving some perceptual
meaning to the resulting numbers. This also allows us to characterize performance in some
absolute sense.
More specifically, we use the following procedure to report results. We choose
some illuminant as a canonical illuminant. We also choose the one material in the SFU
database labeled as white for our standard white material. Then, for every other (test)
illuminant, we compute the mapping that perfectly maps the standard white under the
canonical illuminant to its color under the test illuminant and is diagonal in our constrained
color basis. We then apply this mapping to all the material colors under the canonical illu-
minant to generate predictions for colors under the test illuminant. Errors between predic-
tions and actual measured values in the SFU database are measured using the CIEDE2000
difference equation. This gives us, for each test illuminant, an error value associated with
every material. So for each test illuminant, we report the median CIEDE2000 error and the
root-mean-squared CIEDE2000 error of the material colors. See Table 2.1 for the results.
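The reporting procedure can be sketched in a few lines. Everything named here is hypothetical, and one substitution is made deliberately: the default `difference` function is plain Euclidean distance, used only as a stand-in so the sketch runs without a CIEDE2000 implementation. To reproduce errors in CIEDE2000 units, a real CIEDE2000 function would have to be passed in instead. The caller is assumed to supply a matching pair `C` and `C_inv`.

```python
import math
import statistics


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def euclidean(a, b):
    """Stand-in color difference (NOT CIEDE2000); replace for perceptual units."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_under_test(C, C_inv, white_canon, white_test, material_canon):
    """Map a material color with the diagonal that takes white to white."""
    q_wc = mat_vec(C_inv, white_canon)
    q_wt = mat_vec(C_inv, white_test)
    diagonal = [t / c for t, c in zip(q_wt, q_wc)]
    q_m = mat_vec(C_inv, material_canon)
    return mat_vec(C, [d * q for d, q in zip(diagonal, q_m)])


def median_and_rms_error(C, C_inv, white_canon, white_test,
                         materials_canon, materials_test,
                         difference=euclidean):
    """Median and root-mean-squared prediction error over all materials."""
    errors = [
        difference(
            predict_under_test(C, C_inv, white_canon, white_test, m_canon),
            m_test,
        )
        for m_canon, m_test in zip(materials_canon, materials_test)
    ]
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return statistics.median(errors), rms
```

Running `median_and_rms_error` once per (canonical, test) illuminant pair would fill in one cell of a table of the kind reported for this experiment.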
2.6 Discussion
We have argued for a new data-driven choice of color basis for diagonal color constancy
computations. We show that with respect to some existing metrics, the new choice leads to
a better diagonal model.
While a linear change of color basis poses no problem to those concerned sim-
ply with algorithmic modeling, those who seek relevance to human biological mechanisms
might object (on theoretical grounds) that sensor measurement acquisition may involve
nonlinearities that disrupt the brain's ability to linearly transform the color basis
downstream. Fortunately, experimental results based on single-cell responses and
psychophysical sensitivity suggest that any existing nonlinearities at this level are negligible [41, 57].
Chapter 3
A New Color Space for Image Processing
In this chapter, we motivate the need for a new color space primarily by the desire to
perform illumination-invariant image processing, in which algorithm outputs are not so
sensitive to the illumination conditions of their input images. Simultaneously, we also
seek a color space in which perceptual distances can be computed easily. We show that
these desires relate very naturally to notions in perceptual science, and (with one additional
assumption) fully lock down the form of the color space parameterization. We fit the remaining
parameters to experimental data and apply our new color space to some examples
to illustrate its utility.
3.1 Introduction
While existing color spaces address a range of needs, none of them simultaneously capture
two notable properties required by a large class of applications that includes segmentation
and Poisson image editing [48]. In this work, we present a new color space designed
specifically to address this deficiency.
We propose the following two color space desiderata for image processing:
(1) difference vectors between color pixels are unchanged by re-illumination;
(2) the $\ell_2$ norm of a difference vector matches the perceptual distance between the two
colors.
The first objective restricts our attention to three-dimensional color space parameterizations
in which color displacements, or gradients, can be computed simply as component-wise
subtractions. Furthermore, it expresses the desire for these color displacements (the
most common relational quantities between pixels in image processing) to be invariant to
changes in the spectrum of the scene illuminant. Illumination invariance is useful for applications
where processing is intended to operate on intrinsic scene properties instead of
intensities observed under one particular illuminant. For example, Figure 3.3 (page 58)
shows that when segmenting an image using usual color spaces, images that differ only in
the scene's illumination during capture can require nontrivial parameter tweaking before
the resulting segmentations are clean and consistent. And even then, the segmentations are
not necessarily reliable. Figure 3.4 (page 61) illustrates the extreme sensitivity a Poisson
image editing algorithm exhibits when the illuminant of the foreground object does not
match the background's illumination.
The second condition implies that the standard computational method of measur-
ing error and distance in color space should match the perceptual metric used by human
viewers.
These desiderata have direct correspondence to widely-studied perceptual no-
tions that possess some experimental support. Desideratum (1) corresponds to subtractive
mechanisms in color image processing and to human color constancy. Desideratum (2)
relates to the approximate flatness of perceptual space.
Subtractive mechanisms refer to the notion that humans perform spatial color
comparisons by employing independent processing per channel [33, 43] and that such per-
channel comparisons take a subtractive form. Physiological evidence for subtraction comes
from experiments such as those revealing lateral inhibition in the retina [18] and the exis-
tence of double opponent cells in the visual cortex (where each type provides a spatially
opponent mechanism for comparing a select chromatic channel) [46].
Color constancy is described in Section 1.1.1 and analyzed in detail in Chapter 2.
Sometimes the term chromatic adaptation is used instead to emphasize the inability to
achieve perfect constancy [18].
The approximate flatness of perceptual space refers to the relative empirical
success, as