Ham Thesis

  • 8/11/2019 Ham Thesis

    1/192

    Geometric Methods in Perceptual Image Processing

    A dissertation presented

    by

    Hamilton Yu-Ik Chong

    to

    The School of Engineering and Applied Sciences

    in partial fulfillment of the requirements

    for the degree of

    Doctor of Philosophy

    in the subject of

    Computer Science

    Harvard University

    Cambridge, Massachusetts

    May 2008

    © 2008 - Hamilton Yu-Ik Chong

    All rights reserved.

    Abstract

    derstanding. We single out suggestive contours and illumination valleys as particularly

    interesting because although one is defined in terms of three-dimensional geometry and the

    other in terms of image features, the two produce strikingly similar results (and effectively

    convey a sense of shape). This suggests that the two types of curves capture similar pieces

    of geometric information. To explore this connection, we develop some general techniques

    for recasting questions about the image as questions about the surface.

    Contents

    Title Page
    Abstract
    Table of Contents
    Citations to Previously Published Work
    Acknowledgments
    Dedication

    1 Introduction
        1.1 Perceptual Image Processing
            1.1.1 Color Constancy
            1.1.2 A Perception-based Color Space
            1.1.3 Shape Perception and Line Drawings
        1.2 Contributions
        1.3 Overview of Thesis

    2 A Basis for Color Constancy
        2.1 Introduction and Previous Work
        2.2 Theory
            2.2.1 Measurement Constraints
        2.3 Color Basis for Color Constancy
        2.4 Relationship to Previous Characterizations
        2.5 Results
            2.5.1 Effective Rank of the World
            2.5.2 Von Kries Sensors
            2.5.3 White Patch Normalization
            2.5.4 White Balancing
            2.5.5 Constrained Optimization
        2.6 Discussion

    3 A New Color Space for Image Processing
        3.1 Introduction
        3.2 Previous Work
        3.3 A Perceptual Metric
            3.3.1 Formalizing the Color Space Conditions
            3.3.2 Form of the Metric
            3.3.3 Parameter Estimation
        3.4 Metric Results
        3.5 Applications
            3.5.1 Segmentation
            3.5.2 Poisson Image Editing
        3.6 Discussion

    4 Differential Geometry
        4.1 Manifolds, Tensors, and Calculus
            4.1.1 Manifolds
            4.1.2 Tensor Fields
            4.1.3 Integrals
            4.1.4 Derivatives
        4.2 Intrinsic and Extrinsic Properties of Surfaces
            4.2.1 First Fundamental Form
            4.2.2 Second Fundamental Form
            4.2.3 Tensors that live on the surface itself
            4.2.4 Higher Order Derivatives
        4.3 The Metric Equivalence Problem

    5 Shapes from Curves
        5.1 Introduction
        5.2 Curve Definitions
            5.2.1 Surface-only Curves
            5.2.2 Environment-dependent Curves
        5.3 Relations Between Image and Surface Curves
            5.3.1 Image Plane and Orthographic Projection
            5.3.2 Critical Points of Illumination
            5.3.3 Saint-Venant and Suggestive Energies
            5.3.4 Suggestive Contours and Shading
        5.4 Discussion

    6 Conclusions and Future Work

    Bibliography

    A Color Constancy Proofs
        A.1 Conditions for Success
            A.1.1 Proof of Proposition 1
            A.1.2 Proof of Lemma 1
        A.2 The Space of Solutions

    B Color Space Proofs
        B.1 Deriving the Functional Form
        B.2 Recovering Weber's Law

    C More Differential Geometry
        C.1 Non-coordinate Bases
        C.2 Structure Equations for Surfaces
        C.3 Theorema Egregium

    D Image and Surface Curves
        D.1 Basic Facts
        D.2 Screen Coordinate Vector Fields
        D.3 Curve Relations
        D.4 Principal Directions
        D.5 Apparent Curvature
        D.6 General Lighting

    Citations to Previously Published Work

    Large portions of Chapter 2 on color constancy have previously appeared in:

        The von Kries Hypothesis and a Basis for Color Constancy.
        H. Y. Chong, S. J. Gortler, T. Zickler.
        In Proceedings of ICCV 2007.

    The perceptual optimization of a basis for color constancy (Section 2.5.5) and the development of a new color space as presented in Chapter 3 have previously appeared in:

        A Perception-based Color Space for Illumination-invariant Image Processing.
        H. Y. Chong, S. J. Gortler, T. Zickler.
        In Proceedings of SIGGRAPH 2008.

    Acknowledgments

    I would first like to thank my advisor Professor Steven Gortler for his support over the years (during both undergraduate and graduate days). From the beginning, he gave me great freedom in choosing topics to pursue and provided excellent guidance on how to approach any chosen problem. I would also like to thank Professors Todd Zickler, Fredo Durand, Roger Brockett, and Craig Gotsman, all of whom have served on at least one of my oral exam committees (and read my various scribbles). They have served as wonderful sounding boards for my (often outlandish) ideas and thoughts. Thanks also to Professors Doug DeCarlo and Szymon Rusinkiewicz for very helpful discussions on curves and surfaces. And thanks as well to Brian Guenter for his mentorship at Microsoft Research and for giving me the opportunity to broaden my research exposure and interact with the amazing full-timers, postdocs, and fellow interns there.

    Graduate school is of course a sometimes trying experience, and without the mutual support of friends (all going through similar trials and tribulations), the hurdles would likely feel quite insurmountable. So special thanks to Michael Kellermann, Yi-Ting Huang, Jimmy Lin, Daniel DeSousa, Eleanor Hubbard, Leo Nguyen, Michelle Gardner, James Black, Mark Hempstead, and Camilo Libedinsky, who put up with my antics outside of lab and joined me for sports and gaming. Many thanks as well to lab-mates Danil Kirsanov, Guillermo Diez-Canas, Brenda Ng, Loizos Michael, Philip Hendrix, Ece Kamar, Geetika Lakshmanan, Doug Nachand, Yuriy Vasilyev, Fabiano Romiero, Emmanuel Turquin, Christopher Thorpe, Charles McBrearty, Zak Stone, Kevin Dale, Kalyan Sunkavalli, Moritz Baecher, Miriah Meyer, Christian Ledergerber, and Forrester Cole, who made the lab an inviting place. Thanks also to David Harvey, Wei Ho, and Ivan Petrakiev for their math pointers (and conversations outside of math as well). Thanks also to my friends in and around Saratoga.

    And of course, profuse thanks go to my family for their constant encouragement (despite their teasing me about the cold California winters). I certainly would not have gotten far at all without their support. I am reminded of a Chinese adage (popularized by Yao Ming I believe): "How do I thank my family? How does a blade of grass thank the sun?" Well, I do not have an answer to that question, so I will have to leave it at just "Thanks!"

    Thanks again to everyone (and apologies to the many other deserving people I've left unmentioned)!

    Dedicated to my parents Fu-Chiung and Kathleen Chong,

    to my brothers Sanders and Anthony Chong,

    and to all teachers.

    Chapter 1

    Introduction

    1.1 Perceptual Image Processing

    Perception may roughly be defined as the mind's process of disentangling information from the physical means by which it is conveyed. As such, the study of perception is inherently a study of abstract representations. In distilling information into abstract quanta, the human mind prepares such information for conscious processing and consumption. In visual perception, the inputs are retinal images and the outputs are mental descriptions (e.g., shape and material properties) of the subject being observed.

    On a spectrum of scales at which to probe human perception, the traditional endeavors of cognitive psychologists and visual system neuroscientists may coarsely be described as sitting at the two extremes. The former examine perceptual issues on a qualitative and macroscopic level while the latter identify the microscopic biological building blocks that allow for physical realization. Computational vision glues these two ends of the spectrum together by providing an algorithmic description of how the functional building blocks may give rise to the qualitative behaviors observed and classified through cognitive experiments. Such a process (i.e., the algorithmic transformation of information) can be studied independent of any of its physical instantiations, and hence falls within the purview of computer science.

    The aim of this dissertation is to propose new algorithmic models for aspects of perceptual visual processing. While these models should replicate important features of human vision, we by no means expect our models to fully describe reality. Rather, they are simply meant to provide "first order" quantitative predictions; such predictions can then be used to either bolster theories or point out assumptions in need of greater refinement. One of our hopes is that these models might inform future experimental designs and aid (even if only as a distracting, but somewhat informative, detour) in science's ongoing and successive approximations to truth.

    Ultimately, however, our primary aim in this work is not to predict actual human behavior. Our main interest is in the design of new algorithms that assist computers in attacking various problems in graphics and vision. Motivation for this comes from the robust manner in which humans cope with environmental variations and successfully complete a variegated set of visual tasks. By constructing new perception-inspired algorithms, we may be able to endow computers with comparable robustness in accomplishing similar goals. Therefore, the utility of these models will be judged by their usefulness in algorithmic problem solving; their value (as presented here) stands independent of the goodness of our assumptions on how humans actually process visual information. In this work, we seek progress on three problems: achieving color constancy, designing an illumination-invariant color space, and relating projected image data to three-dimensional shape understanding.

    1.1.1 Color Constancy

    Color constancy refers to the fact that the perceived color of an object in a scene tends to remain constant even when the spectral properties of the illuminant (and thus the tristimulus data recorded at the retina) are changed drastically.

    Figure 1.1: Arrows indicate pixel intensity values that have been replicated as uniform squares of color outside the image context for side-by-side comparison with pixel intensity values from the other image.

    Figure 1.1 illustrates a local version of the effect. The left half of the figure shows a set of puppets lit by normal daylight. The right half of the figure shows the same set of puppets, but after an extreme environmental manipulation has been applied. While no human viewer would consider pairs of spatially corresponding pixels between left and right halves to be sharing the same color, even under this extreme change, the qualitative experience of the colors is mostly preserved. For example, if asked to coarsely label the color of the center puppet's shirt on the right (marked by one of the arrows), one would settle upon the descriptor "yellow." However, when the pixel RGB values are pulled out and displayed against a white background, one instead labels the same RGB values as "blue." The surprising fact is that we do not simply see the right half of Figure 1.1 as consisting only of various shades of blue or purple. We still experience, for example, sensations of yellow (which is widely considered in vision science the color most opposite that of blue [46]).

    Color constancy is even more powerful when the same environmental manipulation is applied to our entire visual field (i.e., is applied globally rather than locally). In such cases, we do not have the bright white of the paper (or the original image colors) to deduce the overall blue-ness of the new environment. For a common, simulated example of such a global manipulation, consider the eventual color shift that occurs after putting on a pair of ski goggles. The act of putting on goggles can be thought of as simulating a change of illuminant to one in which there is less power in the short wavelengths. At first, putting on a pair of ski goggles with an orange filter makes the world appear orange. However, over time, the orange-ness disappears and colors are largely perceived as if goggles had not been worn. When the goggles are later taken off, the world immediately appears exceedingly blue until color perception once again shifts into an adapted state.

    This phenomenon results from some kind of global processing that is still not completely understood. One working model for achieving constancy is that the human first somehow estimates the illuminant. The effect of illumination is then undone by applying a tone mapping operator. This yields colors that are perceived as being lit under some canonical illuminant. This interpretation may be referred to as illumination discounting [18]. Another working model is that the human processes retinal images by relating colors of observed objects. In converting retinal signals into perceived responses, the brain only uses relationships that are not affected by illumination changes. Such an interpretation may be referred to as spatial processing [18, 43].

    Whatever the mechanism, this perceptual processing allows color to be treated as an intrinsic material property. From our day-to-day perceptual experiences, we accept it as sensible to refer to an apple as "red," or an orange as "orange." This, however, belies the fact that color constancy is actually hard to achieve. Recall that the light seen reflected from objects depends on the full spectrum of the illuminant and the per-wavelength attenuation due to the material. Light is only turned into three sensory responses once it hits the eye. Therefore, given scenes with arbitrary illuminant spectra and material reflectances, there is no reason that constancy should be even close to possible. Consider the scenario in which two different materials appear the same under one illuminant (i.e., they are metamers), but look different under a second illuminant. In this case, one color maps to two distinct colors with illuminant change, so color relations clearly do not remain constant.
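    The metamer scenario can be made concrete with a toy numerical sketch. Everything below is hypothetical: spectra are sampled at only four wavelengths, and the sensor sensitivities, reflectances, and illuminants are made-up numbers chosen so that two materials match under one light and differ under another.

```python
import numpy as np

# Hypothetical sensor sensitivities: 3 sensors x 4 wavelength samples.
# The middle sensor integrates over two wavelength samples, which is
# what makes metamers possible in this toy setup.
sensors = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

def response(illuminant, reflectance):
    """Three sensor responses: the illuminant is attenuated per wavelength
    by the material, then integrated against each sensor sensitivity."""
    return sensors @ (illuminant * reflectance)

# Two different (made-up) reflectance spectra.
material_a = np.array([0.5, 0.6, 0.2, 0.5])
material_b = np.array([0.5, 0.2, 0.6, 0.5])

# Two (made-up) illuminant spectra.
flat_light = np.array([1.0, 1.0, 1.0, 1.0])
tilted_light = np.array([1.0, 1.0, 0.5, 1.0])

resp_a_flat = response(flat_light, material_a)
resp_b_flat = response(flat_light, material_b)
resp_a_tilted = response(tilted_light, material_a)
resp_b_tilted = response(tilted_light, material_b)

print("match under flat light:  ", np.allclose(resp_a_flat, resp_b_flat))
print("match under tilted light:", np.allclose(resp_a_tilted, resp_b_tilted))
```

    Under the flat light the two materials produce identical responses, yet under the tilted light they differ, so no mapping that acts on the three responses alone can send both scenes to the same canonical colors.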

    Despite the manifest difficulty in achieving color constancy, models of color constancy are often forced to obey an even more stringent constraint, the generalized von Kries hypothesis [21]. This hypothesis stipulates that color correction is done using independent scalings of the three tristimulus values. Von Kries' commonly adopted model assumes that the effect of changing illumination is to multiply each of our trichromatic sensor responses by a separate (possibly spatially varying) scale factor, and the brain is thus able to discount the change by applying the inverse scale factor [18, 33, 46]. The generalized von Kries model permits a change of color basis to take place before the gain factors are applied.
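    As a rough illustration of the two models just described, the sketch below applies per-channel gains directly (plain von Kries) and inside a different color basis (generalized von Kries). The function names, the example basis matrix, and all numeric values are assumptions made for illustration; how a good basis is actually chosen is the subject of Chapter 2.

```python
import numpy as np

def von_kries(responses, gains):
    """Plain von Kries: scale each of the three sensor channels independently."""
    return np.asarray(responses, dtype=float) * np.asarray(gains, dtype=float)

def generalized_von_kries(responses, gains, basis):
    """Generalized von Kries: change color basis, apply per-channel gains
    in that basis, then change back to the original sensor coordinates."""
    basis = np.asarray(basis, dtype=float)
    in_basis = basis @ np.asarray(responses, dtype=float)
    scaled = np.asarray(gains, dtype=float) * in_basis
    return np.linalg.solve(basis, scaled)  # applies the inverse of `basis`

# Hypothetical response to a white surface under some illuminant.
observed_white = np.array([0.8, 1.0, 0.5])

# Inverse scale factors that map the observed white to (1, 1, 1).
gains = 1.0 / observed_white
corrected_white = von_kries(observed_white, gains)

# A made-up invertible change of basis, for illustration only.
example_basis = np.array([
    [1.0, 0.2, 0.0],
    [0.1, 1.0, 0.1],
    [0.0, 0.2, 1.0],
])
some_color = np.array([0.4, 0.3, 0.2])
corrected_color = generalized_von_kries(some_color, gains, example_basis)

print(corrected_white)
print(corrected_color)
```

    With the identity matrix as the basis, the generalized model reduces to the plain one; with any other invertible basis, the overall map is the basis inverse, times a diagonal matrix, times the basis.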

    With such demanding restrictions, color constancy may seem a vain endeavor. Indeed, humans do not have perfect color constancy; however, our color percepts are nonetheless surprisingly stable [46]. This suggests that the world's illuminants and reflectances are not completely arbitrary. One natural question is then, "What are the necessary and sufficient conditions under which constancy is in fact achievable?" Answers to this are discussed in Chapter 2. There, we will focus mostly on the generalized von Kries model. A related question is, "To what extent are these conditions true in our everyday world?" We will present experimental tests of this as well. Note that these questions concern the general possibility of achieving color constancy with any von Kries-like approach. Therefore, answers to these questions provide blanket statements that upper bound what is attainable for any method employing per-channel gain control for color constancy (e.g., grey world, grey edge [54], gamut mapping [24], retinex [33]). They do not address what algorithm in particular best achieves color constancy under these conditions.

    Given that the necessary conditions for generalized von Kries models to achieve perfect color constancy may not be met exactly, we would also like to compute an optimal basis in which to run von Kries-based color constancy algorithms. The problem of computing such a basis is addressed in Section 2.3. As a follow-up question we also pose the following: "To what extent is there evidence that humans process under the basis we compute?" This question is left unanswered in this thesis.

    With regard to applications, models for color constancy can be applied to color correction problems such as white balancing. In white balancing, a scene is captured under one ambient illuminant. The image is then later displayed, but under viewing conditions that employ a different ambient illuminant. The mismatch in illuminants during capture and display, and hence the mismatch in observer adaptations in both cases, causes discrepancies in the perceived colors of the displayed reproduction. A white balanced image is one in which the captured colors are mapped to colors as seen under the new display illuminant. By displaying the white balanced image, a viewer perceives the reproduced scene in the same way the one who captured the image perceived the original scene.
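    A minimal sketch of white balancing with per-channel gains follows. It uses the grey-world heuristic (one of the gain-control methods named above) to estimate the capture illuminant; this is a stand-in chosen for brevity, not the method developed in this thesis. The function name, the `target_white` parameter, and the tiny test image are all hypothetical, and the image is assumed to hold linear sensor values with no channel averaging to zero.

```python
import numpy as np

def grey_world_white_balance(image, target_white=(1.0, 1.0, 1.0)):
    """Map colors captured under one illuminant toward a target (display) white
    using independent per-channel gains.

    image: array of shape (height, width, 3) with linear sensor values.
    """
    image = np.asarray(image, dtype=float)
    channel_means = image.reshape(-1, 3).mean(axis=0)
    # Grey-world assumption: the average scene color is achromatic, so the
    # channel means reveal the illuminant color up to a common scale.
    estimated_white = channel_means / channel_means.mean()
    gains = np.asarray(target_white, dtype=float) / estimated_white
    return image * gains

# Hypothetical 1x2 image of two grey patches seen under a warm color cast.
neutral_scene = np.array([[[0.2, 0.2, 0.2], [0.6, 0.6, 0.6]]])
color_cast = np.array([1.0, 0.8, 0.5])
captured = neutral_scene * color_cast

balanced = grey_world_white_balance(captured)
print(balanced)
```

    In this toy case the cast is removed and each patch becomes achromatic again (up to an overall brightness scale), because the scene really is grey on average. Real scenes violate that assumption to varying degrees, which is one reason the choice of estimation method matters.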

    1.1.2 A Perception-based Color Space

    Given the importance of color processing in computer graphics, color spaces abound. While existing color spaces address a range of needs, they all suffer from one notable problem that makes them unsatisfactory for a large class of applications including segmentation and Poisson image editing [48]: algorithms working in these color spaces exhibit great sensitivity to the illumination conditions under which their inputs were captured.

    Consider Figure 1.2 in which an active contour segmentation algorithm is run on two images that differ only in the scene's illumination (these were generated using full-spectral images and measured illuminant spectra). Figure 1.2a shows the two initial images. Figure 1.2b shows their segmentations in a typical color space (the space of RGB values provided by the image). Figure 1.2c shows their segmentations in another common color space (CIELab). In both color spaces, the segmentation parameters were tuned on each image to produce segmentations as clean and consistent as possible. Furthermore, even after parameter tweaking, the segmentations do not appear particularly reliable. It would be beneficial to have a segmentation algorithm that could use the same set of parameters to produce consistent segmentations of the two images in spite of their differing illuminations.

    (a) Pair of images to segment. (b) One typical color space. (c) Another color space.

    Figure 1.2: Active contour segmentation of two images in two different color spaces with parameters tweaked for each image so that segmentations are as clean and consistent as possible. Optimal parameters for the image pairs are separated by 2 and 5 orders of magnitude respectively.

    Figure 1.3 shows another example, this time of Poisson image editing. Here, a picture of a bear is cut out from one image (Figure 1.3a) and inserted into a background image of a swimming pool (Figure 1.3b). Figure 1.3c shows the result when run in a usual color space (the space of RGB values provided by the image). In this image, the foreground object is seamlessly inserted into the background image; however, the bear appears rather faded and ghostly. For comparison, Figure 1.3d shows an insertion performed using the color space we present in Chapter 3. Here the bear appears much more corporeal.

    These examples point to the fact that algorithms working in current color spaces lack an essential robustness to the illumination conditions of their inputs. If processing is meant to operate on intrinsic scene properties, then such illumination-dependent results are unsatisfactory. Since color constancy deals with taking measured colors and turning them into stable percepts, addressing this problem is plausibly related to issues of color constancy. Given an algorithm for color constancy, one solution might be to map an image's input colors to some canonical set of colors (i.e., colors as seen under a standard illuminant) and then work in such a space instead. One potential problem with such an approach is that it may depend on the ability of the algorithm to perform difficult tasks like estimating the illuminant in order to map colors to their canonical counterparts. An even better scenario would be if some ability to edit colors in an "intrinsic" sense were built into the choice of color space itself, without requiring the adoption of a specific color correction algorithm.

    (a) Object and mask. (b) Background. (c) Typical result. (d) More desirable result.

    Figure 1.3: Poisson image editing.

    1.1.3 Shape Perception and Line Drawings

    Shape perception refers to the process of taking image data and inferring the three-dimensional geometry of objects within the image. While there are many approaches to estimating shape (e.g., shape from shading [29], shape from texture [5], shape from specular flow [2]), we shall be interested in the connection between shape understanding and curves in an image. As shown in Figure 1.4a, line drawings alone are often enough to convey a sense of shape. This suggests that curves themselves provide a lot of information about a surface. Artists are able to make use of this fact and invert the process; they take a reference object, either observed or imagined, and summarize it as a line drawing that is almost unambiguously interpretable. Two natural questions then emerge: (1) Given a three-dimensional object, what curves should one draw to best convey the shape? (2) Given an image, is there a stable set of curves that humans can robustly detect and use to infer shape, and if so, how are they defined and what precisely is their information content?

    (a) Silhouettes and suggestive contours. (b) Silhouettes only.

    Figure 1.4: Line renderings of an elephant.

    Analyses of the more classically defined curves on surfaces (e.g., silhouettes, ridges, parabolic lines) are inadequate for giving a precise understanding of this problem. For example, Figure 1.4b shows that the relationships between curves and shape perception involve much more than simply silhouette data. The same can be said of the traditionally defined ridges and valleys. Researchers have more recently defined curves such as suggestive contours [15] and apparent ridges [32] which take advantage of view information and produce informative renderings, but more precise analyses of when and why they succeed are still lacking.

    A complete understanding of the set of curves humans latch onto clearly extends far beyond the scope of this work; we instead focus on laying some groundwork for future investigations. Our guiding philosophy is roughly as follows: humans only have retinal images as inputs; assuming curve-based shape understanding is a learned process, any meaningful curves should then be detectable (in a stable manner) in retinal images; therefore, analysis should be developed for relating curves defined in images to geometric properties of the viewed surface. Philosophy aside, the same geometric techniques developed for studying such relations may also be applied to answering other sorts of questions relating curves on surfaces to curves in images, so we hope these techniques may prove useful more generally.

    1.2 Contributions

    Color Constancy. Color constancy is almost exclusively modeled with von Kries (i.e., diagonal) transforms [21]. However, the choice of basis under which von Kries transforms are taken is traditionally ad hoc. Attempts to remedy the situation have been hindered by the fact that no joint characterization of the conditions for worlds¹ to support (generalized) von Kries color constancy has previously been achieved. In Chapter 2, we establish the following:

      • Necessary and sufficient conditions for a world to support doubly linear color constancy.
      • Necessary and sufficient conditions for a world to support generalized von Kries color constancy.
      • An algorithm for computing a locally optimal color basis for generalized von Kries color constancy.

    In the applications we discuss, we are mostly concerned with the cone sensors of the human visual system. However, the same theory applies for camera sensors and can be generalized in a straightforward manner for a different number of sensors.

    ¹ By a "world," we mean a collection of illuminants, reflectances, and visual sensors. These are typically described by a database of the spectral distributions for the illuminants, per-wavelength reflections for the materials, and per-wavelength sensitivities for the sensors.

    A Perception-based Color Space. In Chapter 3, we design a new color space with the

    following properties:

    It has a simple 3D parameterization. Euclidean distances approximately match perceptual distances.

    Color displacements and gradients are approximately preserved under changes in the

    spectrum of the illuminant.

    Coordinate axes have an interpretation in terms of color opponent channels.

    It can easily be integrated with fibered space and measure-theoretic image models.

    The first three bullet points imply the color space can easily be plugged into many image

    processing algorithms (i.e., algorithms can be called as black-box functions and require no

    change in implementation). This color space also enables these algorithms to work on the

    intrinsic image and thereby yield (approximately) illumination-invariant results.

    Shape Perception and Line Drawings. To study shape, we present some technical tools

    in the context of relating Saint-Venant curves to suggestive contours. Contributions include

    the following:

    • Choice of bases for surface and image that simplifies calculations and makes expressions more conducive to interpretation.

    • Transplanting of techniques for working with bases that change at every point and notation for reducing ambiguities.

    • General procedure for expressing image information in terms of three-dimensional geometric quantities.

    • Some relations between suggestive contours, Saint-Venant curves, and surface geometry.

    • Relation between a surface curve's normal curvature in the tangent direction and its apparent curvature in the image.

    1.3 Overview of Thesis

    The organization for the rest of the thesis is as follows:

    Chapter 2. We analyze the conditions under which various models for color constancy can

    work. We focus in particular on the generalized von Kries model. We observe that the von

    Kries compatibility conditions are impositions only on the sensor measurements, not the

    physical spectra. This allows us to formulate the conditions succinctly as rank constraints

    on an order three measurement tensor. Given this, we propose an algorithm that computes

    a (locally) optimal choice of color basis for von Kries color constancy and compare the

    results against other proposed choices.

    Chapter 3. Motivated by perceptual principles (in particular, color constancy), we derive a

    new color space in which the associated metric approximates perceived distances and color

    displacements capture relationships that are robust to spectral changes in illumination. The

    resulting color space can be used with existing image processing algorithms with little

    or no change to the methods. We show application to segmentation and Poisson image

    processing.

    Chapter 4. This chapter presents the mathematical background for Chapter 5 (where we

    consider the relations between shapes and image curves). The previous chapters make only

    minor references to this chapter, so readers only interested in the color processing portions

    of this work can simply refer to this chapter as required. Chapter 2 makes use of multilinear

    algebra, so reading the notation used for denoting tensors is sufficient. Chapter 3 makes

    reference to the metric equivalence problem discussed in Section 4.3. The more involved

    details are not so important, so a high-level overview suffices. Chapter 5 makes full use

    of the presented differential geometry, so readers interested in that chapter should read

    Chapter 4 with more care.

    Chapter 5. We use the formalism presented in Chapter 4 to relate curves detectable in

    images to information about the geometry. We focus in particular on Saint-Venant valleys

    and relate them to suggestive contours. We develop techniques for relating image and

    surface information and apply these to characterizing curve behavior at critical points of

    illumination. We also prove more limited results away from critical points of illumination.

    Appendix A. This appendix provides proofs for the results cited in Chapter 2. It also

    contains an analysis of the structure of the space of worlds supporting generalized von

    Kries color constancy.

    Appendix B. This appendix presents the proof that locks down the functional form of

    our color space parameterization. We also prove that our model recovers the well-known

    Weber's Law for brightness perception.

    Appendix C. This appendix contains more detailed discussions on differential geometry

    that are somewhat tangential to the main exposition. It includes some calculations that can

    be used to verify some of the claims or get a better sense for how the formalism can be

    used.

    Appendix D. This appendix provides calculations and proofs for the various relations we

    discuss in Chapter 5. It also details some further derivations that may be useful for future

    work.

    Chapter 2

    A Basis for Color Constancy

    In this chapter we investigate models for achieving color constancy. We devote particular

    attention to the ubiquitous generalized von Kries model. In Section 2.2.1, we relate the

    ability to attain perfect color constancy under such a model to the rank of a particular order

    three tensor. For cases in which perfect color constancy is not possible, this relationship

    suggests a strategy for computing an optimal color basis in which to run von Kries based

    algorithms.

    2.1 Introduction and Previous Work

    For a given scene, the human visual system, post adaptation, will settle on the same per-

    ceived color for an object despite spectral changes in illumination. Such an ability to dis-

    cern illumination-invariant material descriptors has clear evolutionary advantages and also

    largely simplifies (and hence is widely assumed in) a variety of computer vision algorithms.

    To achieve color constancy, one must discount the effect of spectral changes in

    the illumination through transformations of an observer's trichromatic sensor response values.

    While many illumination-induced transformations are possible, it is commonly assumed

    that each of the three sensors reacts with a form of independent gain control (i.e.,

    each sensor response value is simply scaled by a multiplicative factor), where the gain factors

    depend only on the illumination change [21, 33]. This is termed von Kries adaptation.

    Represented in linear algebra, it is equivalent to multiplying each column vector of sensor

    response values by a shared diagonal matrix (assuming spatially uniform illumination), and

    is therefore also referred to as the diagonal model for color constancy.

    Note that while the initial von Kries hypothesis applied only to direct multiplica-

    tive adjustments of retinal cone sensors, we follow [21] and use the term more loosely to

    allow for general trichromatic sensors. We also allow for a change of color basis to occur

    before the per-channel multiplicative adjustment. (Finlayson et al. [21] refer to this as a

    generalized diagonal model for color constancy, and they term the change of color basis a

    sharpening transform.)
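The generalized diagonal model described above can be sketched numerically. The following is a minimal illustration, not the implementation used in this work: the function name `generalized_von_kries` and its argument names are hypothetical, and the color basis is assumed to be given as an invertible 3x3 matrix whose columns are the basis vectors.

```python
import numpy as np

def generalized_von_kries(colors, gains, basis=None):
    """Apply a (generalized) diagonal transform to sensor colors.

    colors: (3, N) array, one column of sensor responses per observation.
    gains:  length-3 per-channel gain factors; they depend only on the
            illumination change and are shared by every column.
    basis:  optional invertible (3, 3) matrix whose columns are the color
            basis vectors (hypothetical argument). None means the sensor
            basis itself, i.e. the plain von Kries model.
    """
    colors = np.asarray(colors, dtype=float)
    if basis is None:
        basis = np.eye(3)
    # Express the colors in the chosen basis, scale each channel
    # independently, then map the result back to the sensor basis.
    coords = np.linalg.solve(basis, colors)
    scaled = np.asarray(gains, dtype=float)[:, None] * coords
    return basis @ scaled
```

In matrix terms the function applies `basis @ diag(gains) @ inv(basis)` to every column, so the only freedom beyond the three gains is the choice of basis, which is the quantity this chapter sets out to optimize.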

    The (generalized) diagonal model is at the core of the majority of color constancy

    algorithms. Even a number of algorithms not obviously reliant on the diagonal assumption

    in fact rely on diagonal models following a change of color basis [21, 22]; their choice of

    color basis is simply not explicit. Yet, despite the widespread use of the diagonal model,

    good choices of color bases under which diagonal transforms can be taken are only partially

    understood.

    While these limitations have been well-documented, a more complete character-

    ization of the conditions for von Kries compatibility has yet to be established. As a result,

    the development of more powerful systems for choosing optimized color bases has been

    slow. This chapter addresses these issues by answering the following questions:

    (1) What are the necessary and sufficient conditions that sensors, illuminants, and mate-

    rials must satisfy to be exactly von Kries compatible, and what is the structure of the

    solution space?

    (2) Given measured spectra or labeled color observations, how do we determine the color

    space that best supports diagonal color constancy?

    We observe that the joint conditions are impositions only on the sensor measure-

    ments, not the physical spectra. This allows the von Kries compatibility conditions to be

    succinctly formulated as rank constraints on an order 3 measurement tensor. Our analysis

    leads directly to an algorithm that, given labeled color data, computes a locally optimal

    choice of color basis in which to carry out diagonal color constancy computations. The

    proposed framework also unifies most existing analyses of von Kries compatibility.

    2.2 Theory

    We define two notions of color constancy. The first definition captures the idea that a

    single adjustment to the sensors will map all material colors seen under an illuminant $E_1$ to

    reference colors under (a possibly chosen standard) illuminant $E_2$. The second definition

    (also known as relational color constancy) captures the idea that surface colors have a fixed

    relationship between each other no matter what overall illumination lights the scene. As

    stated, these two definitions are not interchangeable. One being true does not imply the

    other.

    To define the issues formally, we need a bit of notation. Let $\mathcal{R}$ be the smallest

    (closed) linear subspace of $L^2$ functions enclosing the spectral space of materials of interest.

    Let $\mathcal{E}$ be the smallest (closed) linear subspace of $L^2$ functions enclosing the spectral space

    of illuminants of interest. Let $p_{R,E}$ be the color (in the sensor basis) of material reflectance

    $R(\lambda) \in \mathcal{R}$ under illumination $E(\lambda) \in \mathcal{E}$. In the following, $D$ and $\bar{D}$ are operators that take

    color vectors and map them to color vectors. $D$ is required to be independent of the material

    $R$; likewise, $\bar{D}$ is required to be independent of the illuminant $E$. The $\star$ denotes the action

    of these operators on color vectors.

    (1) Color constancy:

    For all $E_1, E_2 \in \mathcal{E}$, there exists a $D(E_1, E_2)$ such that for all $R \in \mathcal{R}$,

    $$p_{R,E_2} = D(E_1, E_2) \star p_{R,E_1}$$

    (2) Relational color constancy:

    For all $R_1, R_2 \in \mathcal{R}$, there exists a $\bar{D}(R_1, R_2)$ such that for all $E \in \mathcal{E}$,

    $$p_{R_2,E} = \bar{D}(R_1, R_2) \star p_{R_1,E}$$

    In the case that $D$ and $\bar{D}$ are linear (and hence identified with matrices), $\star$ is just matrix-vector

    multiplication. If $D$ is linear, we say that the world supports linear adaptive color

    constancy. If $\bar{D}$ is linear, we say the world supports linear relational color constancy. $D$

    being linear does not imply $\bar{D}$ is linear, and vice versa. If both $D$ and $\bar{D}$ are linear, we say

    the world supports doubly linear color constancy.

    Figure 2.1: The $3 \times I \times J$ measurement tensor. The tensor can be sliced in three ways to

    produce the matrices $\Omega_{(j)}$, $\Lambda_{(i)}$, and $\Gamma_{(k)}$.

    In particular, we shall be interested in the case when $D$ and $\bar{D}$ are both furthermore

    diagonal (under some choice of color basis). It is proven in [21] that for a fixed color

    space, $D$ is diagonal if and only if $\bar{D}$ is diagonal. So the two notions of color constancy

    are equivalent if either $D$ or $\bar{D}$ is diagonal, and we say the world supports diagonal color

    constancy (the "doubly" modifier is unnecessary). The equivalence is nice because we, as

    biological organisms, can likely learn to achieve definition 1, but seek to achieve definition

    2 for inference.

    Given a set of illuminants $\{E_i\}_{i=1,\dots,I}$, reflectances $\{R_j\}_{j=1,\dots,J}$, and sensor color

    matching functions $\{\rho^k\}_{k=1,2,3}$, we define a measurement data tensor² (see Figure 2.1):

    $$M^k_{ij} := \int \rho^k(\lambda)\, E_i(\lambda)\, R_j(\lambda)\, d\lambda \qquad (2.1)$$

    For fixed values of $j$, we get $3 \times I$ matrices $\Omega_{(j)} := M^k_{ij}$ that map illuminants

    expressed in the $\{E_i\}_{i=1,\dots,I}$ basis to color vectors expressed in the sensor basis. Likewise,

    for fixed values of $i$, we get $3 \times J$ matrices $\Lambda_{(i)} := M^k_{ij}$ that map surface reflectance spectra

    expressed in the $\{R_j\}_{j=1,\dots,J}$ basis to color vectors. We can also slice the tensor by constant

    $k$ to get $I \times J$ matrices $\Gamma_{(k)} := M^k_{ij}$.

    ² We will use latin indices starting alphabetically with i to denote tensor components instead of the usual

    greek letters to simplify notation and avoid confusion with other greek letters floating around.

    Figure 2.2: Core tensor form: a 3x3x3 core tensor is padded with zeros. The core tensor is

    not unique.
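As a rough numerical illustration of Equation (2.1), the measurement tensor can be approximated from sampled spectra with a Riemann sum. This is a sketch under stated assumptions: all spectra are assumed to be sampled on one shared, uniformly spaced wavelength grid, and the function name `measurement_tensor` and its arguments are hypothetical.

```python
import numpy as np

def measurement_tensor(sensors, illuminants, reflectances, d_lambda=1.0):
    """Riemann-sum approximation of the measurement tensor.

    sensors:      (3, W) sensor sensitivities sampled at W wavelengths
    illuminants:  (I, W) illuminant spectra on the same wavelength grid
    reflectances: (J, W) reflectance spectra on the same wavelength grid
    d_lambda:     wavelength spacing (assumed uniform)
    Returns M with shape (3, I, J), indexed as M[k, i, j].
    """
    sensors = np.asarray(sensors, dtype=float)
    illuminants = np.asarray(illuminants, dtype=float)
    reflectances = np.asarray(reflectances, dtype=float)
    # Sum the product of the three sampled spectra over the wavelength axis.
    return np.einsum('kw,iw,jw->kij',
                     sensors, illuminants, reflectances) * d_lambda
```

With this index layout, the three kinds of slices are plain array slices: `M[:, :, j]` is the 3xI slice for material `j`, `M[:, i, :]` is the 3xJ slice for illuminant `i`, and `M[k]` is the IxJ slice for sensor `k`.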

    Since color perception can depend only on the eye's trichromatic color measurements,

    worlds (i.e., sets of illuminant and material spectra) giving rise to the same measurement

    tensor are perceptually equivalent. To understand diagonal color constancy, therefore,

    it is sufficient to analyze the space of measurement tensors and the constraints that these

    tensors must satisfy. This analysis of von Kries compatible measurement tensors is covered

    in section 2.2.1.

    Given a von Kries compatible measurement tensor (e.g., an output from the al-

    gorithm in section 2.3), one may also be interested in the constraints such a tensor places

    on the possible spectral worlds. This analysis is covered in section 2.4.

    2.2.1 Measurement Constraints

    The discussion in this section will always assume generic configurations (e.g., color mea-

    surements span three dimensions, color bases are invertible). Proofs not essential to the

    main exposition are relegated to Appendix A.

    Proposition 1. A measurement tensor supports doubly linear color constancy if and only

    if there exists a change of basis for illuminants and materials that reduces it to the core

    tensor form of Figure 2.2.

    More specifically (as is apparent from the proof of Proposition 1 in Appendix A.1.1), if

    a single change of illuminant basis makes all the $\Omega_{(j)}$ slices null past the third column,

    the measurement tensor supports linear relational color constancy. Likewise, a change of

    material basis making all the $\Lambda_{(i)}$ slices null past the third column implies the measurement

    tensor supports linear adaptive color constancy. Support for one form of linear constancy

    does not imply support for the other.

    The following lemma provides a stepping stone to our main theoretical result and

    is related to some existing von Kries compatibility results (see section 2.4).

    Lemma 1. A measurement tensor supports generalized diagonal color constancy if and

    only if there exists a change of color basis such that, for all $k$, $\Gamma_{(k)}$ is a rank-1 matrix.

    This leads to our main theorem characterizing the space of measurement tensors

    supporting generalized diagonal color constancy.

    Theorem 1. A measurement tensor supports generalized diagonal color constancy if and

    only if it is a rank 3 tensor.³

    An order 3 tensor (3D data block) $T$ is rank $N$ if $N$ is the smallest integer such

    that there exist vectors⁴ $\{a_n, b_n, c_n\}_{n=1,\dots,N}$ allowing decomposition as the sum of outer

    products (denoted by $\circ$):

    $$T = \sum_{n=1}^{N} c_n \circ a_n \circ b_n. \qquad (2.2)$$

    ³ There exist measurement tensors supporting generalized diagonal color constancy with rank less than 3,

    but such examples are not generic.

    ⁴ In the language of differential geometry introduced in Chapter 4, $\{a_n, b_n\}_{n=1,\dots,N}$ are really covector

    component lists. $\{c_n\}_{n=1,\dots,N}$ are really vector component lists.

    Without loss of generality, let $\{a_n\}$ be vectors of length $I$, corresponding to the

    illuminant axis of the measurement tensor; let $\{b_n\}$ be vectors of length $J$, corresponding

    to the material axis of the tensor; and let $\{c_n\}$ be vectors of length 3, corresponding to the

    color sensor axis of the tensor. Let the vectors $\{a_n\}$ make up the columns of the matrix $A$,

    vectors $\{b_n\}$ make up the columns of the matrix $B$, and vectors $\{c_n\}$ make up the columns

    of the matrix $C$. Then the decomposition above may be restated as a decomposition into

    the matrices $(A, B, C)$, each with $N$ columns.

    Proof. (Theorem 1). First suppose the measurement tensor supports generalized diagonal

    color constancy. Then by Lemma 1, there exists a color basis under which each $\Gamma_{(k)}$ is rank-1

    (as a matrix). This means each $\Gamma_{(k)}$ can be written as an outer product, $\Gamma_{(k)} = a_k \circ b_k$.

    In this color basis then, the measurement tensor is a rank 3 tensor in which the matrix $C$

    (following notation above) is just the identity.⁵ We also point out that an invertible change

    of basis (on any of $A$, $B$, $C$) does not affect the rank of a tensor, so the original tensor (before

    the initial color basis change) was also rank 3.

    For the converse case, we now suppose the measurement tensor is rank 3. Since

    $C$ is (in the generic setting) invertible, multi-linearity gives us

    $$C^{-1} \star \left( \sum_{n=1}^{3} c_n \circ a_n \circ b_n \right) = \sum_{n=1}^{3} \left( C^{-1} c_n \right) \circ a_n \circ b_n. \qquad (2.3)$$

    The $\star$ operator on the left hand side of Equation (2.3) denotes the application of the $3 \times 3$

    matrix $C^{-1}$ along the sensor axis of the tensor. The right hand side of Equation (2.3) is

    a rank 3 tensor with each $\Gamma_{(k)}$ slice a rank-1 matrix. By Lemma 1, the tensor must then

    support diagonal color constancy.

    ⁵ Technically, this shows that the tensor is at most rank 3, but since we are working with generic tensors,

    we can safely discard the degenerate cases in which observed colors do not span the three-dimensional color

    space.

    In the proof above, note that the columns of $C$ exactly represent the desired color basis

    under which we get perfect diagonal color constancy. This theorem is of algorithmic importance

    because it ties the von Kries compatibility criteria to quantities (best rank 3 tensor

    approximations) that are computable via existing multilinear methods.
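The converse direction of the proof can be checked numerically on a synthetic example. The sketch below builds a random rank 3 tensor in the form of Equation (2.2), applies the inverse of the sensor-axis factor matrix along the sensor axis, and inspects the rank of each fixed-sensor slice. The sizes, the random factors, and the helper name `slice_ranks` are illustrative assumptions, not data from this work.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J = 5, 7
A = rng.standard_normal((I, 3))   # columns a_n (illuminant axis)
B = rng.standard_normal((J, 3))   # columns b_n (material axis)
C = rng.standard_normal((3, 3))   # columns c_n (sensor axis)

# Rank 3 tensor: M[k, i, j] = sum_n C[k, n] * A[i, n] * B[j, n]
M = np.einsum('kn,in,jn->kij', C, A, B)

# Apply the inverse of C along the sensor axis (the change of color basis).
M_new = np.einsum('mk,kij->mij', np.linalg.inv(C), M)

def slice_ranks(T, tol=1e-8):
    """Numerical rank of each fixed-sensor (I x J) slice of T."""
    return [int(np.linalg.matrix_rank(T[k], tol=tol))
            for k in range(T.shape[0])]

ranks_before = slice_ranks(M)      # generically full rank in the sensor basis
ranks_after = slice_ranks(M_new)   # each slice collapses to a single outer product
```

After the basis change, slice `k` equals the outer product of column `k` of `A` with column `k` of `B`, which is exactly the rank-1 condition of Lemma 1.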

    2.3 Color Basis for Color Constancy

    Given a measurement tensor $M$ generated from real-world data, we would like to find the

    optimal basis in which to perform diagonal color constancy computations. To do this,

    we first find the closest von Kries compatible measurement tensor (with respect to the

    Frobenius norm). We then return the color basis that yields perfect color constancy under

    this approximate tensor.

    By Theorem 1, finding the closest von Kries compatible measurement tensor is

    equivalent to finding the best rank 3 approximation. Any rank 3 tensor may be written in

    the form of equation (2.2) with $N = 3$. It also turns out that such a decomposition of a rank

    three tensor into these outer-product vectors is almost always unique (modulo permutations

    and scalings). We solve for $M$'s best rank 3 approximation (decomposition into $A$, $B$, $C$)

    via Trilinear Alternating Least Squares (TALS) [28]. For a rank 3 tensor, TALS forces $A$,

    $B$, and $C$ to each have 3 columns. It then iteratively fixes two of the matrices and solves for

    the third in a least squares sense.

    Repeating these computations in lockstep guarantees convergence to a local minimum.

    $A$, $B$, $C$ can be used to reconstruct the closest von Kries compatible tensor, and the

    columns of $C$ exactly represent the desired color basis.

    As a side note, the output of this procedure differs from the best rank-(3,3,3)

    approximation given by HOSVD [37]. HOSVD only gives orthogonal bases as output and

    the rank-(3,3,3) truncation does not in general yield a closest rank 3 tensor. HOSVD may,

    however, provide a good initial guess.

    The following details on TALS mimic the discussion in [52]. For further information,

    see [28, 52] and the references therein. The Khatri-Rao product of two matrices $A$

    and $B$ with $N$ columns each is given by

    $$A \odot B := \left[ a_1 \otimes b_1,\; a_2 \otimes b_2,\; \dots,\; a_N \otimes b_N \right], \qquad (2.4)$$

    where $\otimes$ is the Kronecker product.
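As a small worked instance of the Khatri-Rao product (the numbers are chosen purely for illustration), take two 2x2 matrices. Each column of the result is the Kronecker product of the corresponding columns:

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \quad B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}, \quad A \odot B = \begin{pmatrix} 1 \cdot 5 & 2 \cdot 6 \\ 1 \cdot 7 & 2 \cdot 8 \\ 3 \cdot 5 & 4 \cdot 6 \\ 3 \cdot 7 & 4 \cdot 8 \end{pmatrix} = \begin{pmatrix} 5 & 12 \\ 7 & 16 \\ 15 & 24 \\ 21 & 32 \end{pmatrix}.$$

The result keeps the shared number of columns, while its row count is the product of the two row counts.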

    Denote the flattening of the measurement tensor $M$ by $M_{IJ3}$ if the elements of $M$

    are unrolled such that the rows of matrix $M_{IJ3}$ loop over the $(i, j)$-indices with $i = 1, \dots, I$

    as the outer loop and $j = 1, \dots, J$ as the inner loop. The column index of $M_{IJ3}$ corresponds

    with the dimension of the measurement tensor that is not unrolled (in this case $k = 1, 2, 3$).

    The notation for other flattenings is defined symmetrically. We can then write

    $$M_{JI3} = (B \odot A)\, C^T. \qquad (2.5)$$

    By symmetry of equation (2.5), we can write out the least squares solutions for each of the

    matrices (with the other two fixed), where $\dagger$ denotes the pseudoinverse:

    $$A = \left[ (B \odot C)^{\dagger} M_{J3I} \right]^T, \qquad (2.6)$$

    $$B = \left[ (C \odot A)^{\dagger} M_{3IJ} \right]^T, \qquad (2.7)$$

    $$C = \left[ (B \odot A)^{\dagger} M_{JI3} \right]^T. \qquad (2.8)$$
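The alternating updates of Equations (2.6) to (2.8) can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used for the reported results: the function names are hypothetical, the tensor is assumed to be stored as `M[k, i, j]`, the factors are initialized randomly (an HOSVD-based initial guess is not shown), and a fixed iteration count stands in for a convergence test.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X and Y, as in Equation (2.4)."""
    n = X.shape[1]
    return np.einsum('in,jn->ijn', X, Y).reshape(-1, n)

def reconstruct(A, B, C):
    """Rank 3 tensor with entries sum_n C[k, n] * A[i, n] * B[j, n]."""
    return np.einsum('kn,in,jn->kij', C, A, B)

def tals_rank3(M, n_iter=200, seed=0):
    """Alternating least squares sketch for a rank 3 fit of M[k, i, j]."""
    K, I, J = M.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, 3))
    B = rng.standard_normal((J, 3))
    C = rng.standard_normal((K, 3))
    # Flattenings: rows loop over the two leading subscripts (first one
    # outermost); columns index the remaining axis.
    M_J3I = M.transpose(2, 0, 1).reshape(J * K, I)   # rows (j, k), cols i
    M_3IJ = M.reshape(K * I, J)                      # rows (k, i), cols j
    M_JI3 = M.transpose(2, 1, 0).reshape(J * I, K)   # rows (j, i), cols k
    for _ in range(n_iter):
        A = (np.linalg.pinv(khatri_rao(B, C)) @ M_J3I).T   # Equation (2.6)
        B = (np.linalg.pinv(khatri_rao(C, A)) @ M_3IJ).T   # Equation (2.7)
        C = (np.linalg.pinv(khatri_rao(B, A)) @ M_JI3).T   # Equation (2.8)
    return A, B, C
```

A call such as `A, B, C = tals_rank3(M)` followed by `reconstruct(A, B, C)` gives the fitted tensor, and the columns of the returned `C` play the role of the color basis. Because each update solves its least squares subproblem exactly, the Frobenius error should not increase from one sweep to the next, but the result can depend on the initialization.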

    2.4 Relationship to Previous Characterizations

    As mentioned in the introduction, there are two main sets of theoretical results. There are

    the works of [20,58] that give necessary and sufficient conditions for von Kries compatibil-

    ity under a predetermined choice of color space, and are able to build infinite dimensional

    von Kries compatible worlds for this choice. Then there are the works of [21, 22] that

    prescribe a method for choosing the color space, but only for worlds with low dimen-

    sional linear spaces of illuminants and materials. We omit direct comparison to the various

    spectral sharpening techniques [6, 16, 23] in this section, as these methods propose more

    intuitive guidelines rather than formal relationships.

    Previous analyses treat the von Kries compatibility conditions as constraints on

    spectra, whereas the analysis here treats them as constraints on color measurements. In this

    section, we translate between the two perspectives. To go from spectra to measurement

    tensors is straightforward. To go the other way is a bit more tricky. In particular, given a

    measurement tensor with rank-1 $\Gamma_{(k)}$, there is not a unique world generating this data. Any

    set of illuminants $\{E_i\}_{i=1,\dots,I}$ and reflectances $\{R_j\}_{j=1,\dots,J}$ satisfying Equation (2.1) (with

    $M$ and $\rho^k$ fixed) will be consistent with the data. Many constructions of worlds are thus

    possible. But if one first selects particular illuminant or material spectra as mandatory in-

    clusions in the world, then one can state more specific conditions on the remaining spectral

    choices. For more on this, see Appendix A.2.

    In [21, 22], it is shown that if the illuminant space is 3 dimensional and the ma-

    terial space is 2 dimensional (or vice versa), then the resulting world is (generalized) von

    Kries compatible. As a measurement tensor, this translates into stating that any 3x3x2 mea-

    surement tensor is (complex) rank 3. However this 3-2 condition is clearly not necessary

    as almost every rank 3 tensor is not reducible via change of bases to size 3x3x2. In fact, one

    can always extend a 3x3x2 tensor to a 3x3x3 core tensor such that the $\Gamma_{(k)}$ are still rank-1.

    The illuminant added by this extension is neither black with respect to the materials, nor in

    the linear span of the first two illuminants.

    The necessary and sufficient conditions provided in [58] can be seen as special

    cases of Lemma 1. The focus on spectra leads to a case-by-case analysis with arbitrary

    spectral preferences. However, the essential property these conditions point to is that the

    2x2 minors of $\Gamma_{(k)}$ must be zero (i.e., $\Gamma_{(k)}$ must be rank-1).
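Written out, the minor condition says that for every pair of rows $i, i'$ and every pair of columns $j, j'$ of a fixed-sensor slice $\Gamma$,

$$\Gamma_{ij}\, \Gamma_{i'j'} - \Gamma_{ij'}\, \Gamma_{i'j} = 0.$$

As a quick check of one direction, if $\Gamma$ is an outer product with entries $\Gamma_{ij} = a_i b_j$, then

$$\Gamma_{ij}\, \Gamma_{i'j'} - \Gamma_{ij'}\, \Gamma_{i'j} = a_i b_j\, a_{i'} b_{j'} - a_i b_{j'}\, a_{i'} b_j = 0,$$

so every 2x2 minor vanishes. Conversely, vanishing of all 2x2 minors forces the rank to be at most 1, which in the generic (nonzero) setting assumed here means exactly rank-1.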

    One case from [58] is explained in detail in [20]. They fix a color space, a space

    of material spectra, and a single reference illumination spectrum. They can then solve for

    the unique space of illumination spectra that includes the reference illuminant and is von

    Kries compatible (in the fixed color basis) with the given material space.

    Figure 2.3: The rows of a single $\Lambda_{(1)}$ slice are placed into a new measurement tensor (rows

    are laid horizontally above) with all other entries set to zero. The $*$ marks the nonzero entries.

    In our framework, this can be interpreted as follows. The given input gives rise to

    a single $\Lambda_{(1)}$ measurement slice. The three rows of this slice can be pulled out and placed

    in a new measurement tensor of the form shown in Figure 2.3. This measurement tensor is

    then padded with an infinite number of zero $\Lambda_{(i)}$ matrices. The $\Gamma_{(k)}$ slices of this new tensor

    are clearly rank-1 matrices, and thus this tensor is von Kries compatible in the given color

    space. Moreover, any measurement tensor with rank-1 $\Gamma_{(k)}$ that includes the original $\Lambda_{(1)}$

    slice in its span must have $\Lambda_{(i)}$ slices that are spanned by the $\Lambda_{(i)}$ slices in Figure 2.3. With

    this fixed tensor and the fixed material spectra, one can then solve Equation (2.1) to obtain

    the space of compatible illumination spectra. This space can be described by three non-black

    illuminants and an infinite number of black illuminants (giving zero measurements

    for the input material space). Since the original $\Lambda_{(1)}$ measurement slice is in the span of

    the $\Lambda_{(i)}$ slices, the original reference illuminant must be in the solution space.

    2.5 Results

    We used the SFU color constancy dataset to create a measurement tensor to use in our ex-

    periments. The SFU database provides 8 illuminants simulating daylight, and 1,995 mate-

    rials including measured spectra of natural objects. Fluorescent spectra were removed from

    the dataset in hopes of better modeling natural lighting conditions since, in color matching

    experiments, fluorescent lamps cause unacceptable mismatches of colored materials that

    are supposed to match under daylight [60].

    Color matching functions were taken to be CIE 1931 2-deg XYZ with Judd 1951

    and Vos 1978 modifications [60]. To resolve mismatches in spectral sampling, we inter-

    polated data using linear reconstruction. Cone fundamentals were taken to be the Vos and

    Walraven (1971) fundamentals [60]. Experiments were run with illuminant spectra nor-

    malized with respect to the L2 norm.

    We followed the strategy of Section 2.3 to produce optimized color spaces for

    von Kries color constancy.

    2.5.1 Effective Rank of the World

    To test the effective rank of our world (as measured by available datasets), we approxi-

    mate the SFU measurement tensor with tensors of varying rank and see where the dropoff

    in approximation error occurs. Figure 2.4 shows the results from the experiment. We mea-

    sured error in CIEDE2000 units in an effort to match human perceptual error. CIEDE2000

    is the latest CIE standard for computing perceptual distance and gives the best fit of any

    method to the perceptual distance datasets used by the CIE [40]. As a rule of thumb, 1

    CIEDE2000 unit corresponds to about 1 or 2 just noticeable differences.

    Figure 2.4: RMS error of the SFU tensor approximation (color RMS error in CIEDE2000 units versus

    the approximating tensor's rank, for ranks 1 through 4). The effective rank of the SFU dataset is

    about 3. The green dot marks the error for the rank 3 tensor in which the color gamut was

    constrained to include the entire set of human-visible colors (see Section 2.5.5).

    The red curve in Figure 2.4 shows that the SFU measurement tensor is already

    quite well approximated by a rank 3 tensor. The rank 2 tensor's approximation error is

    68.3% of the rank 1 tensor's approximation error. The rank 3 tensor's approximation error

    is 6.7% of the rank 2 tensor's approximation error. The rank 4 tensor's approximation error

    is 31.3% of the rank 3 tensor's approximation error. We discuss the meaning of the green

    dot later in Section 2.5.5.

    2.5.2 Von Kries Sensors

    The matrix mapping XYZ coordinates to the new color coordinates, as computed using the

    TALS optimization on the SFU database, is given by the following (where the rows have

    been normalized to unit length):

    $$C^{-1} = \begin{pmatrix} 9.375111 \times 10^{-1} & 3.197979 \times 10^{-1} & 1.371214 \times 10^{-1} \\ 4.783729 \times 10^{-1} & 8.779015 \times 10^{-1} & 2.117135 \times 10^{-2} \\ 8.338662 \times 10^{-2} & 1.235156 \times 10^{-1} & 9.888329 \times 10^{-1} \end{pmatrix}. \qquad (2.9)$$

    Figure 2.5: Color matching functions (four panels: CIE XYZ Response, Cone Response, TALS Effective

    Response, and ANLS Effective Response; each plots normalized response against wavelength in nm,

    from 380 to 680). Top row shows the standard CIE XYZ and Cone

    matching functions. The bottom row shows the effective sensor matching functions resulting

    from applying the $C$ matrix derived from optimizations on the SFU database. ANLS

    (alternating nonlinear least squares) constrains the color space gamut to contain the human-visible

    gamut and measures perceptual error using CIEDE2000.

    We ran our algorithm on the Joensuu database as well [45]. The Joensuu database provides

    22 daylight spectra and 219 natural material spectra (mostly flowers and leaves). The resulting

    basis vectors (columns of $C$) were within a couple degrees (in XYZ space) of the

    SFU optimized basis vectors. This seems to suggest some amount of stability in the result.

    The effective sensor matching functions given by the optimized color basis are

    shown in Figure 2.5. As is intuitively suspected, the optimized basis causes a sharpening

    in the peaks of the matching functions. This expectation is motivated by the fact that the

    diagonal model is exact for disjoint sensor responses. The ANLS result will be discussed

    in Section 2.5.5.

    2.5.3 White Patch Normalization

    There are many experiments one might devise to measure the color constancy afforded by

    various color bases. We chose to replicate a procedure commonly used in the literature.

    This experiment is based on a white patch normalization algorithm, and is described

    below.

    Dataset and algorithms

    We ran our color basis algorithm on the SFU dataset [7] and compared our resulting color

    basis against previous choices (the cone sensor basis, 4 bases derived from different low

    dimensional approximations of spectral data, that of Barnard et al. [6], and the sensor

    sharpened basis [23]).

    The low dimensional worlds to which we compare are taken to have either 3

    dimensional illuminant spaces and 2 dimensional material spaces (a 3-2 world) or vice

    versa (a 2-3 world); this allows computing color bases via the procedure in [21].

    We took two different approaches to approximating spectra with low dimensional

    vector spaces. In the first approach (described in [21, 23]), we run SVD on the illuminant

    and material spectra separately. We then save the best rank-3 and rank-2 approximations.

    This is Finlayson's "perfect sharpening" method for databases with multiple lights [23],

    and is one of the algorithms that falls under the label of spectral sharpening.
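    The first (SVD-based) approach can be sketched as follows. This is a minimal illustration rather than the code used for the experiments: it assumes each spectrum is stored as one row of a NumPy array, and the function name `low_rank_basis` and the random stand-in data are hypothetical.

```python
import numpy as np

def low_rank_basis(spectra, rank):
    """Best rank-`rank` approximation of `spectra` (one spectrum per row)
    in the least-squares sense, plus the orthonormal basis that spans it."""
    # Thin SVD: spectra = U @ diag(s) @ Vt, singular values sorted descending.
    U, s, Vt = np.linalg.svd(spectra, full_matrices=False)
    basis = Vt[:rank]                            # shape (rank, n_wavelengths)
    approx = (U[:, :rank] * s[:rank]) @ basis    # truncated reconstruction
    return approx, basis

# Random stand-in data (NOT the SFU spectra): 8 lights, 20 materials, 31 samples.
rng = np.random.default_rng(0)
illuminants = rng.random((8, 31))
materials = rng.random((20, 31))

# A "3-2 world": 3-dimensional illuminants, 2-dimensional materials.
illum_approx, illum_basis = low_rank_basis(illuminants, 3)
mat_approx, mat_basis = low_rank_basis(materials, 2)
```

    Swapping the two ranks gives the corresponding "2-3 world". Note that the spectra are not mean-centered here, since the text describes running SVD on the spectra directly.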

    As pointed out in [42], if error is to be measured in sensor space, there are alter-

    natives to running PCA on spectra. Given a measurement tensor, the alternative (tensor-

    based) approach instead applies SVD on the tensor flattenings MJ3I and M3IJ to get the

    principal combination coefficients of the spectral bases (to be solved for) that approximate

    the sample spectra. Refer to [42] for details.

    We label the algorithm of Barnard et al. [6] as "Barnard". This is a more recent

    algorithm and also falls under the label of spectral sharpening.

    We also ran experiments against Finlayson's "sensor sharpening" algorithm. Note

    that this algorithm does not actually use the database in determining the sensor transforms.

    Its heuristic is simply to transform the sensors so that the responses are as sharp as pos-

    sible.

    We also tested against a modified version of Finlayson's "database sharpening"

    method [23]. This algorithm as stated is defined in the case when the database has two

    illuminants and possibly many materials. Since our database has multiple lights, we used

    PCA on the set of illuminant spectra and ran the algorithm using the two most dominant

    principal components as our two illuminants. The results were nearly identical to both of

    the 2-3 methods, and we therefore omit the corresponding curves from the graphs.

    Experimental procedure

    We run the same white-patch normalization experiment as in [21]. As input, we are given

    a chosen white material W and an illuminant E. For the SFU database, we used the only

    material labeled as white as our white material W. For every other material R, we compute

    a descriptor by dividing each of its 3 observed color coordinates by the 3 color coordinates

    of W (the resulting 3 ratios are then transformed as a color vector to XYZ coordinates so

    that consistent comparisons can be made with different choices of color space). In a von

    Kries world, the descriptor for R would not depend on the illuminant E. To measure the

    non von Kries-ness of a world, we can look at how much these descriptors vary with the

    choice of E.

    More formally, we define the descriptor as:

    $$d^{W,R}_{E} = C \left[ \operatorname{diag}\!\left( C^{-1} p^{W,E} \right) \right]^{-1} C^{-1} p^{R,E} \tag{2.10}$$

    The function diag creates a matrix whose diagonal elements are the given vector's components.

    $C$ is a color basis. $C$, $p^{R,E}$, and $p^{W,E}$ are given in the CIE XYZ coordinate system.

    This means we compute the color vectors $p^{W,E}$ and $p^{R,E}$ as:

    $$(p^{W,E})_k := \int \rho_k(\lambda)\, E(\lambda)\, W(\lambda)\, d\lambda \tag{2.11}$$

    $$(p^{R,E})_k := \int \rho_k(\lambda)\, E(\lambda)\, R(\lambda)\, d\lambda \tag{2.12}$$

    with $\rho_1(\lambda) = \bar{x}(\lambda)$ = CIE response function for X, $\rho_2(\lambda) = \bar{y}(\lambda)$ = CIE response function

    for Y, and $\rho_3(\lambda) = \bar{z}(\lambda)$ = CIE response function for Z. The columns of the matrix $C$ are

    the (normalized) basis vectors of a new color space in XYZ coordinates. $C^{-1}$ is the inverse

    of $C$. This means that given some color vector $v$ in XYZ coordinates, $C^{-1}v$ computes the

    coordinates of the same color vector expressed in the new color space's coordinate system.
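    Equations 2.11 and 2.12 and the change of coordinates can be sketched numerically as follows. This is a minimal sketch assuming all spectra are sampled on a common, uniformly spaced wavelength grid; the function names `color_vector` and `to_basis` and the toy numbers are hypothetical and are not real CIE data.

```python
import numpy as np

def color_vector(cmfs, illuminant, reflectance, d_lambda):
    """Riemann-sum approximation of the integral of
    matching function * illuminant * reflectance over wavelength.

    cmfs: (3, n) sampled matching functions (rows play the role of x, y, z).
    illuminant, reflectance: (n,) spectra on the same wavelength grid.
    """
    return (cmfs * illuminant * reflectance).sum(axis=1) * d_lambda

def to_basis(C, v):
    """Coordinates of the XYZ vector v in the basis given by the columns of C.
    Solving C x = v is equivalent to computing inv(C) @ v."""
    return np.linalg.solve(C, v)

# Toy stand-in data on a 4-sample grid (NOT real matching functions).
cmfs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 1.0, 0.0],
                 [0.0, 0.0, 0.0, 1.0]])
E = np.full(4, 2.0)                      # flat illuminant
R = np.array([0.5, 0.25, 0.25, 1.0])     # material reflectance
p = color_vector(cmfs, E, R, d_lambda=10.0)
coords = to_basis(2.0 * np.eye(3), p)    # trivial basis: scaled identity
```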

    To compute a non von Kries-ness error, we fix a canonical illuminant $E'$ and compute

    descriptors $d^{W,R}_{E'}$ for every test material $R$. We then choose some different illuminant

    $E$ and again compute descriptors $d^{W,R}_{E}$ for every test material $R$. Errors for every choice of

    $E$ and $R$ are computed as:

    $$\text{Error} = 100 \times \frac{\lVert d^{W,R}_{E'} - d^{W,R}_{E} \rVert}{\lVert d^{W,R}_{E'} \rVert} \tag{2.13}$$

    For each instance of the experiment, we choose one SFU test illuminant $E$ and

    compute the errors over all test materials (canonical illuminant $E'$ is kept the same for

    each experimental instance). Each time, the color basis derived from running our method

    (labeled as Optimized) performed the best.
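    The descriptor of Equation 2.10 and the error of Equation 2.13 can be sketched as follows. This is an illustrative sketch, not the experiment code: the function names, the example basis `C`, and the synthetic colors are hypothetical, and the error is normalized by the canonical descriptor as an assumption about Equation 2.13. The example builds a world that is exactly von Kries in the basis `C`, so the error vanishes there but not in the XYZ (identity) basis.

```python
import numpy as np

def descriptor(C, p_white, p_mat):
    """White-patch descriptor: divide the material's coordinates by the
    white's coordinates in basis C, then map the ratios back to XYZ."""
    w = np.linalg.solve(C, p_white)    # white in the candidate basis
    r = np.linalg.solve(C, p_mat)      # material in the candidate basis
    return C @ (r / w)                 # component-wise ratios, back in XYZ

def von_kries_error(d_canonical, d_test):
    """Percent deviation of the test descriptor from the canonical one."""
    return 100.0 * np.linalg.norm(d_canonical - d_test) / np.linalg.norm(d_canonical)

# Hypothetical basis and colors (assumptions for illustration only).
C = np.array([[0.9, 0.3, 0.1],
              [0.4, 0.9, 0.0],
              [0.1, 0.1, 1.0]])
w_coords = np.array([1.0, 1.0, 1.0])   # white, in basis coordinates
r_coords = np.array([0.2, 0.5, 0.7])   # material, in basis coordinates
D = np.array([0.5, 1.2, 2.0])          # illuminant change, diagonal in basis C

p_white_canon, p_mat_canon = C @ w_coords, C @ r_coords
p_white_test, p_mat_test = C @ (D * w_coords), C @ (D * r_coords)

err_in_C = von_kries_error(descriptor(C, p_white_canon, p_mat_canon),
                           descriptor(C, p_white_test, p_mat_test))
I = np.eye(3)
err_in_xyz = von_kries_error(descriptor(I, p_white_canon, p_mat_canon),
                             descriptor(I, p_white_test, p_mat_test))
```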

    To visualize the results of the experimental test phase, we plot histograms of %

    color vectors that are mapped correctly with a diagonal transform versus the % allowable error.

    Hence, each histogram plot requires the specification of two illuminants: one canonical

    and one test. Figure 2.6 shows the cumulative histograms for instances in which the stated

    basis performs the best and worst relative to the next best basis. Relative performance be-

    tween two bases is measured as a ratio of the areas under their respective histogram curves.

    The entire process is then repeated for another canonical illuminant to give a total of 4 graphs.
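    The cumulative histogram curves and the area-ratio comparison can be sketched as follows. This is a minimal sketch under assumptions: the function names, the threshold grid, and the two small error lists are hypothetical, and the area is computed with a simple trapezoidal rule.

```python
import numpy as np

def cumulative_curve(errors, thresholds):
    """Percent of color vectors whose error is within each allowable error."""
    errors = np.asarray(errors, dtype=float)
    return np.array([100.0 * np.mean(errors <= t) for t in thresholds])

def area_under(curve, thresholds):
    """Trapezoidal area under a curve sampled at `thresholds`."""
    curve = np.asarray(curve, dtype=float)
    widths = np.diff(thresholds)
    return float(np.sum(widths * (curve[:-1] + curve[1:]) / 2.0))

# Hypothetical percent errors for two bases on the same illuminant pair.
thresholds = np.arange(0.0, 11.0)          # allowable error: 0%, 1%, ..., 10%
errors_a = [1.0, 2.0, 3.0, 4.0]
errors_b = [2.0, 4.0, 6.0, 8.0]

area_a = area_under(cumulative_curve(errors_a, thresholds), thresholds)
area_b = area_under(cumulative_curve(errors_b, thresholds), thresholds)
relative_performance = area_a / area_b     # > 1 means basis A does better
```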

    Figure 2.6: Percent vectors satisfying von Kries mapping versus percent allowable error.

    Each curve represents a different choice of color space. For low dimensional worlds, the

    dimension of the illuminant space precedes the dimension of the material space in the

    abbreviated notation. Low dimensional approximations were obtained either by running

    PCA on spectra or by tensor methods described in text. We show the experimental instances

    in which our derived basis performs the best and worst relative to the next best basis. The

    left and right halves differ in choice of canonical illuminant for testing. Unlike Barnard,

    our method effectively optimizes all pairs of lights.

    White patch normalization results

    The optimization algorithm labeled as Barnard requires specification of a canonical illu-

    minant and chooses a color basis such that the mapping between any other test illuminant

    and the canonical one is as diagonal as possible (in a least squares sense).

    To be fair, we show two sets of results: one where the canonical illuminant during

    test matches the canonical illuminant used in Barnard optimization; and one set where a

    different canonical illuminant is chosen during test from that used in Barnard optimization.

    The second canonical illuminant is chosen to illustrate the best worst-case relative perfor-

    mance of our algorithm. Goodness is measured as the ratio of areas under the histogram

    curves.

    Barnard's algorithm performs close to ours for some pairings of test and canonical

    illuminants, but is outperformed in most cases. These results are explained theoretically

    by the fact that, even though it optimizes over multiple light-pairs, the method of Barnard

    et al. always optimizes with respect to a single canonical illuminant. In contrast, we effec-

    tively optimize over all possible pairs of lights.

    As noted earlier, we also tested against Finlayson's database sharpening method

    [23] (using PCA on lights to handle multiple lights). The results were nearly identical to

    both of the 2-3 methods.

    2.5.4 White Balancing

    In this section, we discuss a rough perceptual validation we performed. We first down-

    loaded a hyperspectral image from the database provided by [25]. Each pixel of the hyper-

    spectral image gives a discretized set of samples for the per-wavelength reflectance function

    of the material seen at that pixel. The particular image we used was that of a flower shown

    in Figure 2.7. Assuming a lighting model in which viewed rays are simply computed as the

    product of the illuminant and reflectance spectra in the wavelength domain, we rendered the

    scene as it would appear under a tungsten light bulb and as it would appear under daylight.

    For visualization we converted all images into RGB coordinates and gamma corrected each

    coordinate before display. The goal of the algorithm was to start with the input image of

    the flower under a tungsten light and transform it into the target image of the flower under

    daylight using a generalized diagonal model. The goodness of a color basis is judged by the

    degree to which it facilitates the transformation of the input into the desired output (with

    the significance of discrepancies determined by human perceptual error).
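    The rendering step (per-wavelength product of illuminant and reflectance, followed by projection onto the matching functions) can be sketched as follows. This is a sketch under assumptions: the array layout, the function name `render_xyz`, and the toy data are hypothetical, and the conversion to RGB and gamma correction for display are omitted.

```python
import numpy as np

def render_xyz(reflectance_cube, illuminant, cmfs, d_lambda):
    """Render a hyperspectral reflectance image under an illuminant.

    reflectance_cube: (H, W, n) per-pixel reflectance samples.
    illuminant: (n,) illuminant spectrum on the same wavelength grid.
    cmfs: (3, n) sampled matching functions.
    Returns an (H, W, 3) image of tristimulus values.
    """
    radiance = reflectance_cube * illuminant            # per-wavelength product
    return np.einsum('hwn,kn->hwk', radiance, cmfs) * d_lambda

# Toy stand-in data (NOT real spectra): a 2x2 image with 4 wavelength samples.
cube = np.ones((2, 2, 4))
light = np.array([1.0, 2.0, 3.0, 4.0])
cmfs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 1.0]])
image = render_xyz(cube, light, cmfs, d_lambda=1.0)
```

    Calling `render_xyz` once with a tungsten spectrum and once with a daylight spectrum would give the input and target images of the experiment.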

    (a) Input. (b) Target.

    (c) Cone. (d) 2-3 Tensor. (e) Ours.

    Figure 2.7: Color correction example. Diagonal mappings are used in three different color

    spaces to attempt a mapping of the input image (a) to the target illumination whose truth

    value is shown in (b).

    We ran three different algorithms (Cone, 2-3 tensor, and Ours) on the SFU database

    to derive three optimized color bases for diagonal color constancy. Instead of choosing one

    particular algorithm for then computing diagonals, we wanted to characterize some notion

    of best achievable performance under each basis. We simulated the best learned diagonals

    with the following steps: transform the SFU derived measurement tensor by $C^{-1}$

    to obtain color coordinates in the candidate color basis; then for each material, compute

    the diagonal matrix mapping the color coordinates under the tungsten light bulb to the

    corresponding material color under daylight illumination (just component-wise ratios); fi-

    nally, average the diagonal matrices over all materials to get the overall diagonal mapping

    between the two illuminants under the candidate color basis.
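    The steps above can be sketched as follows. This is an illustrative sketch, not the code used to produce Figure 2.7: the function names, the example basis, and the synthetic colors are hypothetical. The example uses a world that is exactly diagonal in the basis `C`, in which case the averaged diagonal recovers the true mapping.

```python
import numpy as np

def average_diagonal(C, xyz_src, xyz_dst):
    """Average, over materials, of the component-wise ratios that map
    source-illuminant colors to target-illuminant colors in basis C.

    xyz_src, xyz_dst: (n_materials, 3) XYZ colors of the same materials.
    """
    C_inv = np.linalg.inv(C)
    src = xyz_src @ C_inv.T        # coordinates in the candidate basis
    dst = xyz_dst @ C_inv.T
    return (dst / src).mean(axis=0)

def apply_diagonal(C, diag, xyz):
    """Map XYZ colors (last axis of size 3) with C diag(d) C^-1."""
    M = C @ np.diag(diag) @ np.linalg.inv(C)
    return xyz @ M.T

# Hypothetical basis and material colors (assumptions for illustration only).
C = np.array([[0.9, 0.3, 0.1],
              [0.4, 0.9, 0.0],
              [0.1, 0.1, 1.0]])
coords = np.array([[1.0, 1.0, 1.0],
                   [0.2, 0.5, 0.7],
                   [0.6, 0.3, 0.9]])
D_true = np.array([0.5, 1.2, 2.0])
xyz_tungsten = coords @ C.T
xyz_daylight = (coords * D_true) @ C.T

d = average_diagonal(C, xyz_tungsten, xyz_daylight)
predicted = apply_diagonal(C, d, xyz_tungsten)
```

    Because `apply_diagonal` acts on the last axis, the same call can be applied to an (H, W, 3) image to produce the corrected output.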

    Figure 2.7 shows the output images for the three different color bases. The dif-

    ferences are subtle, but one can see that the cone basis yields leaves that are too green, and

    the 2-3 tensor method yields a flower that has too much blue (not enough of the red and

    yellow of the flower is present). Our method provides the best match to the ideal target

    image.

    2.5.5 Constrained Optimization

    For miscellaneous reasons, we may also want to constrain our optimized solution in other

    ways. For example, in Chapter 3 we seek a color basis in which the implied color gamut

    (collection of colors that have only positive components in the color basis) encompasses

    the entire set of human-visible colors. There, we also choose to measure perceptual error

    using the CIEDE2000 difference equation instead of using the standard $\ell_2$ error in a linear

    color space. These extra constraints make each step of the alternating process of Section

    2.3 a nonlinear least squares optimization. To handle the numerics, we used the Levenberg-

    Marquardt nonlinear least squares algorithm as implemented by [38] along with a wrapper

    function that enforces constraints.

    The green dot in Figure 2.4 represents the approximation error for the

    rank 3 tensor in which the color basis was constrained such that the implied color gamut

    would include the entire set of human-visible colors. If we use the green dot in place of

    the red dot for the rank 3 approximation, we still get a good approximation of the SFU

    tensor. The constrained rank 3 tensor's approximation error is 10.7% of the rank 2 tensor's

    approximation error. The rank 4 tensor's approximation error is 19.5% of the constrained

    rank 3 tensor's approximation error. We do not plot green dots for other tensor ranks

    because the gamut constraint is not so well-defined in those cases.

    The color basis transform resulting from the constrained optimization is given by

    the following (which maps XYZ color vectors to the new color space):

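    $$C^{-1} = \begin{pmatrix}
    9.465229 \times 10^{-1} & 2.946927 \times 10^{-1} & 1.313419 \times 10^{-1} \\
    1.179179 \times 10^{-1} & 9.929960 \times 10^{-1} & 7.371554 \times 10^{-3} \\
    9.230461 \times 10^{-2} & 4.645794 \times 10^{-2} & 9.946464 \times 10^{-1}
    \end{pmatrix} \tag{2.14}$$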

    See Figure 2.5 for the effective von Kries sensors, which are labeled as Alternating Non-

    linear Least Squares (ANLS).

    In Chapter 3, we are particularly interested in the ability of the diagonal model to

    describe the effect of illumination change in the constrained color basis (whose gamut includes

    all the human-visible colors). Unlike the white patch normalization experiment, which

    measured some notion of relational color constancy, we would like to directly measure the

    goodness of diagonal color constancy itself. To do this, we simulate a perfect white bal-

    ancing algorithm that perfectly maps the color of a standard white material under one

    illuminant to the color of the same standard material seen under a different illuminant.

    In a von Kries compatible world, the same diagonal used to perform this mapping would

    correctly map all material colors under the initial illuminant to their corresponding colors

    under the second illuminant. We therefore apply the same mapping to each of the material

    colors under the first illuminant to derive predicted colors for the second illuminant. Deviations

    from the actual measured colors under the second illuminant then give a measure

    of the non-von Kries-ness of the world. This experiment also has the advantage of allowing

    error to be measured via the CIEDE2000 difference equation, giving some perceptual

    meaning to the resulting numbers. This also allows us to characterize performance in some

    absolute sense.

    Median and RMS Diagonal Color Constancy Error for Constrained Color Basis

    Canonical  Error   Tung.  Tung.+  3500K  3500K+  4100K  4100K+  4700K  4700K+
    Tung.      median  -      1.67    0.54   2.23    1.14   2.72    1.74   3.20
               RMS     -      2.28    0.84   2.98    1.54   3.63    2.31   4.34
    Tung.+     median  1.46   -       1.07   0.54    0.60   1.02    0.31   1.54
               RMS     2.32   -       1.69   0.80    0.97   1.40    0.55   2.11
    3500K      median  0.51   1.13    -      1.68    0.57   2.16    1.19   2.67
               RMS     0.85   1.60    -      2.24    0.79   2.92    1.59   3.65
    3500K+     median  1.88   0.52    1.53   -       1.07   0.58    0.66   1.14
               RMS     2.97   0.80    2.32   -       1.67   0.77    1.09   1.55
    4100K      median  0.96   0.60    0.52   1.11    -      1.60    0.61   2.10
               RMS     1.53   0.88    0.82   1.53    -      2.18    0.84   2.89
    4100K+     median  2.35   1.04    2.05   0.60    1.62   -       1.14   0.58
               RMS     3.58   1.44    3.01   0.79    2.39   -       1.71   0.81
    4700K      median  1.41   0.29    1.05   0.63    0.58   1.10    -      1.59
               RMS     2.17   0.49    1.55   0.96    0.82   1.51    -      2.17
    4700K+     median  2.84   1.60    2.57   1.23    2.18   0.61    1.70   -
               RMS     4.24   2.18    3.77   1.58    3.16   0.80    2.43   -

    Table 2.1: Results from the white balancing experiment. Top error is median error, bottom

    error is RMS error. Errors are measured in CIEDE2000 units. Row illuminant is the one

    that is chosen as canonical. The illuminants are taken from the SFU dataset: "Tung." is

    a basic tungsten bulb (Sylvania 50MR16Q 12VDC); the different temperatures correspond

    to Solux lamps of the marked temperatures; the "+" means that a Roscolux 3202 Full Blue

    filter has been applied.

    More specifically, we use the following procedure to report results. We choose

    some illuminant as a canonical illuminant. We also choose the one material in the SFU

    database labeled as white for our standard white material. Then, for every other (test)

    illuminant, we compute the mapping that perfectly maps the standard white under the

    canonical illuminant to its color under the test illuminant and is diagonal in our constrained

    color basis. We then apply this mapping to all the material colors under the canonical illu-

    minant to generate predictions for colors under the test illuminant. Errors between predic-

    tions and actual measured values in the SFU database are measured using the CIEDE2000

    difference equation. This gives us, for each test illuminant, an error value associated with

    every material. So for each test illuminant, we report the median CIEDE2000 error and the

    root-mean-squared CIEDE2000 error of the material colors. See Table 2.1 for the results.
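    The reporting procedure above can be sketched as follows. This is a sketch under stated assumptions: the function names, the example basis, and the synthetic colors are hypothetical, and a plain Euclidean distance in XYZ stands in for CIEDE2000 (which is not implemented here); any callable with the same signature could be passed instead.

```python
import numpy as np

def white_balance_errors(C, xyz_canon, xyz_test, white_index, distance):
    """Per-material error of the diagonal (in basis C) that maps the standard
    white exactly from the canonical illuminant to the test illuminant.

    xyz_canon, xyz_test: (n_materials, 3) XYZ colors under each illuminant.
    distance: callable (xyz_a, xyz_b) -> float, e.g. a CIEDE2000 routine.
    """
    C_inv = np.linalg.inv(C)
    canon = xyz_canon @ C_inv.T                        # basis coordinates
    test = xyz_test @ C_inv.T
    diag = test[white_index] / canon[white_index]      # maps white exactly
    predicted = (canon * diag) @ C.T                   # predictions, in XYZ
    return np.array([distance(p, t) for p, t in zip(predicted, xyz_test)])

def summarize(errors):
    """Median and root-mean-squared error over materials."""
    errors = np.asarray(errors, dtype=float)
    return float(np.median(errors)), float(np.sqrt(np.mean(errors ** 2)))

# Hypothetical data: a world that is exactly von Kries in basis C.
C = np.array([[0.9, 0.3, 0.1],
              [0.4, 0.9, 0.0],
              [0.1, 0.1, 1.0]])
coords = np.array([[1.0, 1.0, 1.0],     # row 0 plays the role of the white
                   [0.2, 0.5, 0.7],
                   [0.6, 0.3, 0.9]])
D = np.array([0.5, 1.2, 2.0])
euclidean = lambda a, b: float(np.linalg.norm(a - b))  # stand-in for CIEDE2000

errors = white_balance_errors(C, coords @ C.T, (coords * D) @ C.T, 0, euclidean)
median_err, rms_err = summarize(errors)
example_summary = summarize([1.0, 2.0, 3.0, 4.0])
```

    Repeating this for every choice of canonical and test illuminant would fill in one median and one RMS entry per cell of the results table.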

    2.6 Discussion

    We have argued for a new data-driven choice of color basis for diagonal color constancy

    computations. We show that with respect to some existing metrics, the new choice leads to

    a better diagonal model.

    While a linear change of color basis poses no problem to those concerned sim-

    ply with algorithmic modeling, those who seek relevance to human biological mechanisms

    might object (on theoretical grounds) that sensor measurement acquisition may involve

    nonlinearities that disrupt the brain's ability to linearly transform the color basis downstream.

    Fortunately, experimental results based on single-cell responses and psychophysical

    sensitivity suggest that any existing nonlinearities at this level are negligible [41, 57].

    Chapter 3

    A New Color Space for Image Processing

    In this chapter, we motivate the need for a new color space primarily by the desire to

    perform illumination-invariant image processing, in which algorithm outputs are not so

    sensitive to the illumination conditions of their input images. Simultaneously, we also

    seek a color space in which perceptual distances can be computed easily. We show that

    these desires relate very naturally to notions in perceptual science and, with one additional

    assumption, fully lock down the form of the color space parameterization. We fit the remaining

    parameters to experimental data and apply our new color space to some examples

    to illustrate its utility.

    3.1 Introduction

    While existing color spaces address a range of needs, none of them simultaneously capture

    two notable properties required by a large class of applications that includes segmentation

    and Poisson image editing [48]. In this work, we present a new color space designed

    specifically to address this deficiency.

    We propose the following two color space desiderata for image processing:

    (1) difference vectors between color pixels are unchanged by re-illumination;

    (2) the $\ell_2$ norm of a difference vector matches the perceptual distance between the two

    colors.
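    One way to write the two desiderata symbolically is sketched below. The notation is an assumption introduced here for illustration only: $F$ denotes the color space parameterization, $x$ and $x'$ are two colors, $T$ is a map describing the effect of a change of illuminant on colors, and $d$ is perceptual distance.

```latex
% Hypothetical notation (not from the source): F, T, d as described above.
\begin{align*}
F(Tx) - F(Tx') &= F(x) - F(x')
  && \text{(1) difference vectors are unchanged by re-illumination} \\
\lVert F(x) - F(x') \rVert_2 &= d(x, x')
  && \text{(2) the } \ell_2 \text{ norm matches perceptual distance}
\end{align*}
```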

    The first objective restricts our attention to three-dimensional color space parameterizations

    in which color displacements, or gradients, can be computed simply as component-wise

    subtractions. Furthermore, it expresses the desire for these color displacements (the

    most common relational quantities between pixels in image processing) to be invariant to

    changes in the spectrum of the scene illuminant. Illumination invariance is useful for ap-

    plications where processing is intended to operate on intrinsic scene properties instead of

    intensities observed under one particular illuminant. For example, Figure 3.3 (page 58)

    shows that when segmenting an image using usual color spaces, images that differ only in

    the scene's illumination during capture can require nontrivial parameter tweaking before

    the resulting segmentations are clean and consistent. And even then, the segmentations are

    not necessarily reliable. Figure 3.4 (page 61) illustrates the extreme sensitivity a Poisson

    image editing algorithm exhibits when the illuminant of the foreground object does not

    match the background's illumination.

    The second condition implies that the standard computational method of measur-

    ing error and distance in color space should match the perceptual metric used by human

    viewers.

    These desiderata have direct correspondence to widely-studied perceptual no-

    tions that possess some experimental support. Desideratum (1) corresponds to subtractive

    mechanisms in color image processing and to human color constancy. Desideratum (2)

    relates to the approximate flatness of perceptual space.

    Subtractive mechanisms refer to the notion that humans perform spatial color

    comparisons by employing independent processing per channel [33, 43] and that such per-

    channel comparisons take a subtractive form. Physiological evidence for subtraction comes

    from experiments such as those revealing lateral inhibition in the retina [18] and the exis-

    tence of double opponent cells in the visual cortex (where each type provides a spatially

    opponent mechanism for comparing a select chromatic channel) [46].

    Color constancy is described in Section 1.1.1 and analyzed in detail in Chapter 2.

    Sometimes the term chromatic adaptation is used instead to emphasize the inability to

    achieve perfect constancy [18].

    The approximate flatness of perceptual space refers to the relative empirical

    successas