Optimal Electric Field Estimation and Control for Coronagraphy


Tyler D. Groff

A Dissertation

Presented to the Faculty

of Princeton University

in Candidacy for the Degree

of Doctor of Philosophy

Recommended for Acceptance

by the Department of

Mechanical and Aerospace Engineering

Adviser: Dr. N. Jeremy Kasdin

September 2012

Abstract

Detecting and characterizing extrasolar planets has become a very relevant field in astrophysics.

There are several methods to achieve this, but by far the most difficult and potentially

most rewarding approach is direct imaging of the planets. Coronagraphs can be

used to image the area surrounding a star with sufficient contrast to detect orbiting planets.

However, coronagraphs exhibit an extreme sensitivity to optical aberrations which causes

starlight to leak into the search area. To solve this problem we use deformable mirrors to

correct the field, recovering a small search area of high contrast (commonly referred to as a

“dark hole”) where we can once again search for planets.

These coronagraphs require focal plane wavefront control techniques to achieve the nec-

essary contrast levels. These correction algorithms are iterative and the control methods

require an estimate of the electric field at the science camera, which requires nearly all of

the images taken for the correction. In order to maximize science time the amount of time

required for correction must be minimized, which means reducing the number of exposures

required for correction. Given the large number of images required for estimation, the ideal

choice is to use fewer exposures to estimate the electric field. With a more efficient monochro-

matic estimation in hand, we also seek to apply this correction over as broad a bandwidth

as possible. This allows us to spectrally characterize a target without having to repair the

field for every wavelength.

This thesis derives and demonstrates an optimal estimator that uses prior knowledge to

create the estimate of the electric field. In this way we can optimally estimate the electric field

by minimizing the number of exposures required to estimate under an error constraint. With

an optimal estimator in place for monochromatic light, we also demonstrate a controller that

can suppress the field over a bandwidth when provided with this monochromatic estimate.

The challenges, current levels of performance, and future directions of this work are discussed

in detail.


Acknowledgements

Many people have contributed towards my successes in life over a very long period of time.

First, I thank my adviser, Dr. N. Jeremy Kasdin. You have been an excellent mentor over

the years and I consider you a great friend. Thank you for trusting me to use your laboratory;

with a resource like that it is hard not to be successful. Your enthusiasm for this work has

kept me engaged and made working with you very enjoyable. I greatly appreciate the time

I have spent debating and discussing all technical aspects of our work. I have learned a lot

from you and it has most certainly left a great impression.

I also thank my committee and readers, Drs. David Spergel, Robert Vanderbei, Mike

Littman, and Craig Arnold. Your continued advice over the past five years has been greatly

appreciated, as has the unique perspective each of you has taken on the work I present

here. I will always be grateful for your continuous presence and willingness to discuss any

challenge that I have faced. In addition to my adviser and committee, the faculty here at

Princeton have been incredibly supportive. Dr. Dick Miles’ continued support and interest

in my graduate career has been to my great benefit. Much of my understanding of optics

can be credited to him. Dr. Robert Stengel has taught me a great deal about estimation and

control, and I have very much appreciated his ongoing interest in my research. I also thank

Ed Turner for providing so many opportunities to work at the Subaru telescope, which has

played a substantial role in my career. I also thank the support staff in the department,

particularly Jessica O’Leary, Jill Ray, and Candy Reed, who always seem to be able to solve

any problem.

The post-docs in our group over my time have been limited to Mike McElwain and Alexis

Carlotti. I have enjoyed working with both of you, and look forward to more. In addition

to great friendship, I owe a debt of gratitude to the older students from my research group.

Drs. Amir Give’on, Laurent Pueyo, Jason Kay, Eric Cady, and Dmitry Savransky have all

contributed to my understanding and development in this field. Amir, we did not overlap at

Princeton but your continued presence at JPL has afforded us the opportunity to compare


notes and work together several times. Laurent, I love getting to banter about math and

wavefront control with you. I always walk away knowing more, and I look forward to seeing

you whenever I can. Jason Kay, it feels like an eternity since I walked in the door and you

taught me everything about the lab, and I miss the loud banter and hatred for equipment

crashes. I will never forget our train ride to Boston. Eric Cady, I value all of our work

and conversations together and I am happy to see you so often as a colleague and friend.

Best enabler ever. Dmitry Savransky, I also appreciate our continued friendship and I am

grateful to see you on a regular basis. Thanks for tolerating my general impatience with

computers. Our symbiotic approach to understanding optics and computers is sorely missed.

To the younger students in our research group, it’s been fun having you around. A J, you

are picking up all the little quirks in the lab quickly and you have made my job much easier.

It’s been nice working with someone in there again and I’m glad that isn’t over.

One nice aspect of this field is its closeness and friendliness. As a result I have spent

a great deal of time with many individuals outside of Princeton and I consider them my

extended research group. Drs. Olivier Guyon, Ruslan Belikov, Remi Soummer, Frantz

Martinache, and Stuart Shaklan in particular have all made significant contributions to the

quality of the work presented here, and have given me much to think about. They have

been very generous in lending advice, providing thoughtful conversation, and have quickly

become people I consider to be very good friends.

I have many friends from my time at Princeton who are not part of my group: Mac Haas,

Mike Burke, Andy Stewart, Josh Proctor, James Michael, Will Larrison, Katie Quaranta,

my fellow fifth years, the bonfire attendees, the softball team, and rock climbing crews have

all made my time at Princeton quite enjoyable.

From my college years at Tufts University I would like to thank the ME faculty there,

particularly my advisers Dr. Gary Leisk and Dr. Robert White. They have gone above and

beyond, and I am glad to stay in contact with them to this day. I also would like to thank the

folks at DFM Engineering, where I truly learned how to do mechanical design and learned


to love building telescopes. They are truly a family, and I have learned so much from them.

I especially thank Dr. Frank Melsheimer and Kate Melsheimer for opening their home to

me and taking me under their wing. I also thank the Astrogeeks of OELS. Steve Lee, Dave

Olson, Ben Reed, and Poti Doukas have been a constant point of support. Steve, Dave,

and Ben have been some of the most important and constant mentors in my entire life, and

Poti quickly joined their ranks. I also thank all Astrogeeks and the entirety of the Outdoor

Education Laboratory Schools. It was through this program that I discovered astronomy,

and my continued participation has helped keep my wonder of the universe alive. It is in

this spirit that I consider the work presented here to merely be part of a life series entitled

“I Wonder....”

I end by thanking my wonderful family. My Grandparents Roy, Sally, Shirley, Jim, Dean.

My Parents Dean and Lauri. Your unwavering support through my entire life has gotten me

to this point, and I could not have done it without you. My Brother Shawn. Thank you

for serving our country, particularly in a time of war. Keep throwing rocks down hills, just

make sure nobody is at the bottom... To the rest of my family, aunts and uncles and in-laws,

I thank you for your support as well, and for adding richness to my life.

My beautiful and intelligent wife, Kimberley. Your support has been unparalleled by

anyone. Your constancy, kindness, and intelligence have made my life (and this thesis) so

much better. I like hanging out with you, and I want you to know that you are the love of

my life.

None of the work would have been possible without substantial NASA funding. This

work was funded by NASA Grant #NNX09AB96G and the NASA Earth and Space Science

Graduate Fellowship.

This dissertation carries the number T-3243 in the records of the Department of Mechanical

and Aerospace Engineering.


To the love of my life, Kimberley.

This would all be meaningless without you.


Contents

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii

List of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi

Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii

1 Introduction 1

1.1 Science Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Coronagraphs and Wavefront Control . . . . . . . . . . . . . . . . . . . . . . 3

1.3 The High Contrast Imaging Laboratory . . . . . . . . . . . . . . . . . . . . . 6

1.4 Two Deformable Mirrors in Series . . . . . . . . . . . . . . . . . . . . . . . . 10

1.5 Fourier Optics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.5.1 Propagation: Fresnel Transform . . . . . . . . . . . . . . . . . . . . . 14

1.5.2 Imaging: Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . 16

1.6 Controllability of Amplitude and Phase . . . . . . . . . . . . . . . . . . . . . 20

1.6.1 Pupil Plane Controllability: Angular Spectrum . . . . . . . . . . . . 21

1.6.2 Image Plane Controllability: The Propagation Factor . . . . . . . . . 25

1.7 Numerical Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

1.8 Thesis Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

1.9 Chapter Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40


2 Focal Plane Wavefront Control 42

2.1 Monochromatic Wavefront Control . . . . . . . . . . . . . . . . . . . . . . . 43

2.2 Wavelength Dependence of the Image Plane . . . . . . . . . . . . . . . . . . 53

2.3 Continuous Bandwidth Constraint . . . . . . . . . . . . . . . . . . . . . . . . 55

2.4 Windowed Stroke Minimization . . . . . . . . . . . . . . . . . . . . . . . . . 57

2.5 Extrapolating Estimates in Wavelength . . . . . . . . . . . . . . . . . . . . . 61


3 Batch Process Electric Field Estimation 68

3.1 Linearity of the Electric Field . . . . . . . . . . . . . . . . . . . . . . . . . . 69

3.2 Pairwise Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

3.3 DM Diversity: Batch Process Estimation . . . . . . . . . . . . . . . . . . . . 71

3.4 Probe Shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

4 Kalman Filter Estimation 77

4.1 Constructing the Optimal Filter . . . . . . . . . . . . . . . . . . . . . . . . . 78

4.2 Sensor and Process Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4.3 Iterative Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

4.4 Optimal Probes: Using the Control Signal . . . . . . . . . . . . . . . . . . . 93


5 Laboratory Results 96

5.1 Monochromatic Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

5.1.1 DM Diversity Performance . . . . . . . . . . . . . . . . . . . . . . . . 97

5.1.2 Kalman Filter Performance . . . . . . . . . . . . . . . . . . . . . . . 98

5.2 Broadband Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5.2.1 Prior to Single Mode Photonic Crystal Fiber . . . . . . . . . . . . . . 102

5.2.2 Photonic Crystal Single Mode Fiber Upgrade . . . . . . . . . . . . . 104

5.3 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108


6 Sources and Propagation of Error 115

6.1 Precision of a Contrast Measurement . . . . . . . . . . . . . . . . . . . . . . 116

6.2 Estimation Algorithms and Propagating Error . . . . . . . . . . . . . . . . . 120

6.3 Accuracy of Wavelength Extrapolation . . . . . . . . . . . . . . . . . . . . . 126

6.4 DM Controllable Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

6.5 Experiment Stability and Laser Power . . . . . . . . . . . . . . . . . . . . . 127

6.6 Stability of Laser Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

6.7 Final Remarks on Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

7 Conclusions and Future Directions 132

7.1 Parameter Adaptive Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . 133

7.2 Dual Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

7.3 Including Alternate Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

7.3.1 Establishing a Reference . . . . . . . . . . . . . . . . . . . . . . . . . 136

7.3.2 Applying Reference to the Time Update . . . . . . . . . . . . . . . . 136

7.4 Bias Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

7.5 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

Bibliography 139


List of Tables

1.1 Coordinates in each plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4.1 Definition of all Kalman Filter Matrices . . . . . . . . . . . . . . . . . . . . . 86

4.2 Definition of Kalman Filter Vectors . . . . . . . . . . . . . . . . . . . . . . . 88


List of Figures

1.1 Telescope Diffraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Atmospheric Adaptive Optics . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 HCIL Optical Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.4 Ideal vs. Aberrated PSF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.5 HCIL Filter Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.6 Single DM FPWC Experiments . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.7 Light Propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.8 Fourier Imaging Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

1.9 Angular Spectrum Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . 20

1.10 DM Nominal Shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

1.11 Controllability of Amplitude Aberrations . . . . . . . . . . . . . . . . . . . . 30

1.12 Numerical Dimension of Planes . . . . . . . . . . . . . . . . . . . . . . . . . 38

2.1 DM Actuation Over Control History . . . . . . . . . . . . . . . . . . . . . . 51

2.2 Extrapolating Estimates in Wavelength . . . . . . . . . . . . . . . . . . . . . 65

4.1 Feedback Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4.2 Detector Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.1 Monochromatic Correction with DM-Diversity . . . . . . . . . . . . . . . . . 97

5.2 Monochromatic Correction with Kalman Filter: 4 Image Pairs . . . . . . . . 98




5.5 Monochromatic Correction with Kalman Filter: 1 Image Pair . . . . . . . . . 101

5.6 Broadband Correction: Pre-PCSM Extrapolated Results . . . . . . . . . . . 103

5.7 Broadband Correction: Pre-PCSM Extrapolate Individual Filters . . . . . . 110

5.8 Broadband Correction: Extrapolated Results . . . . . . . . . . . . . . . . . . 111

5.9 Broadband Correction: Extrapolated Estimate Individual Filters . . . . . . . 112

5.10 Broadband Correction: Direct Estimate Results . . . . . . . . . . . . . . . . 113

5.11 Broadband Correction: Direct Estimate Individual Filters . . . . . . . . . . . 114

6.1 Phase of Propagation Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . 122

6.2 Interferometric Measurement of Superposition . . . . . . . . . . . . . . . . . 125

6.3 Interferometric Measurement of the Influence Function . . . . . . . . . . . . 125

6.4 Contrast Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

6.5 Performance Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

7.1 Sensor Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135


List of Symbols

i Imaginary Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

λ Wavelength . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

E The complex valued electric field . . . . . . . . . . . . . . . . . . . . . . . 13

n Normal Vector, any vector normal to the Optical Axis . . . . . . . . . . . 13

q Point from which the field is known . . . . . . . . . . . . . . . . . . . . . 13

p Point of the unknown field . . . . . . . . . . . . . . . . . . . . . . . . . . 13

rp/q Vector from the unknown to the known field . . . . . . . . . . . . . . . . 13

Σ The Surface of Integration in the Rayleigh-Sommerfeld Integral . . . . . . 13

Σ′ The Surface of Integration in the Rayleigh-Sommerfeld Integral . . . . . . 13

S Differential Surface Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

z Free space propagation distance . . . . . . . . . . . . . . . . . . . . . . . 13

L{·} The lens operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

f Focal length of an imaging optic . . . . . . . . . . . . . . . . . . . . . . . 17

ξ First coordinate in an Intermediate Plane . . . . . . . . . . . . . . . . . . 17

η Second coordinate in an Intermediate Plane . . . . . . . . . . . . . . . . . 17

u First coordinate in the Pupil Plane . . . . . . . . . . . . . . . . . . . . . . 18

v Second coordinate in the Pupil Plane . . . . . . . . . . . . . . . . . . . . 18

x First coordinate in the Image Plane . . . . . . . . . . . . . . . . . . . . . 18

y Second coordinate in the Image Plane . . . . . . . . . . . . . . . . . . . . 18

F{·} The Fourier Transform Operator . . . . . . . . . . . . . . . . . . . . . . . 18


A Amplitude Distribution, Typically at a Pupil Plane . . . . . . . . . . . . . 21

φ Phase Difference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

D The diameter of the pupil . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

· Shorthand for the Fourier Transform . . . . . . . . . . . . . . . . . . . . . 34

pup Subscript Indicating the Pupil Plane . . . . . . . . . . . . . . . . . . . . . 44

g(u, v) Arbitrary Aberrated Field (Complex) . . . . . . . . . . . . . . . . . . . . 44

im Subscript Indicating the Image Plane . . . . . . . . . . . . . . . . . . . . 45

C{·} Arbitrary Linear Operator . . . . . . . . . . . . . . . . . . . . . . . . . . 45

I Intensity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

DH Subscript Indicating the Dark Hole . . . . . . . . . . . . . . . . . . . . . . 45

IDH A Scalar, Average Dark Hole Intensity . . . . . . . . . . . . . . . . . . . . 45

ℜ Real Part of a Complex Variable . . . . . . . . . . . . . . . . . . . . . . . 46

λ0 The Central Wavelength Being Estimated . . . . . . . . . . . . . . . . . . 47

aq Amplitude for a Single DM Actuator . . . . . . . . . . . . . . . . . . . . . 48

h Deformable Mirror Physical Height Change . . . . . . . . . . . . . . . . . 48

ℑ Imaginary Part of a Complex Variable . . . . . . . . . . . . . . . . . . . . 48

u Vector of DM Actuation Signals . . . . . . . . . . . . . . . . . . . . . . . 48

M Matrix Mapping DM Actuation to Image Plane Intensity . . . . . . . . . 48

b Matrix Operator Mapping Deformable Mirror-Aberrated Field to Image Plane Intensity . . . . . . . . 48

d Inner Product of the Aberrated Field . . . . . . . . . . . . . . . . . . . . 48

J Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

µ Lagrange Multiplier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

uopt The Optimal DM Command . . . . . . . . . . . . . . . . . . . . . . . . . 49

w(λ) Intensity Weight as a Function of Wavelength . . . . . . . . . . . . . . . . 54

∆λ Bandwidth [Meters] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

λ1 The Lower Bounding Wavelength . . . . . . . . . . . . . . . . . . . . . . . 57


λ2 The Upper Bounding Wavelength . . . . . . . . . . . . . . . . . . . . . . 57

δ Scalar Weight on the Lagrange Multiplier . . . . . . . . . . . . . . . . . . 58

α Amplitude of the Aberrated Field at the Pupil . . . . . . . . . . . . . . . 62

β Phase of the Aberrated Field at the Pupil . . . . . . . . . . . . . . . . . . 62

z Noisy Sensor Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 72

x The Current State, or Electric Field . . . . . . . . . . . . . . . . . . . . . 72

n Sensor Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

H Observation Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

k Current Discrete Point in Time in Estimation . . . . . . . . . . . . . . . . 78

P Covariance of the Electric Field . . . . . . . . . . . . . . . . . . . . . . . . 78

R Sensor Noise Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . 79

K Gain Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Φ Time Update in a Discrete Time Filter . . . . . . . . . . . . . . . . . . . 83

Γ Linear Propagation of Control to Image Plane Electric Field . . . . . . . . 83

w Process Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Λ Linear Propagation of Process Noise to Image Plane Electric Field . . . . 83

Q Process Noise Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . 85


Notation

• Search Area: The area in the image plane where the coronagraph has been designed

to produce high contrast for the detection of dim companions.

• Dark Hole: A region in the aberrated image where wavefront control has been used to

recover high contrast.

• Focal Plane Wavefront Control (Controller): This terminology refers to the control law

being used in the correction algorithm.

• Focal Plane Wavefront Correction: This encompasses the entire algorithm used to

correct the wavefront, including the state estimator and control law.

• < ·, · >: The inner product of any matrix, used to evaluate the intensity, IΣ,

of the electric field in a given area of the image plane.

• Matrix Inner Product: an inner product that produces a scalar value, intended to

describe a scalar value of the intensity in the dark hole, IDH.

• Scalar Inner Product: an element by element inner product of each value in a vector,

intended to find the intensity distribution of the electric field in the dark hole, Iim(x, y).
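As a hedged illustration of the two inner products defined above, the following sketch evaluates both on a small made-up field. The function names and sample values are assumptions for illustration only. Whether IDH is additionally normalized by the number of pixels is not specified in this list, so the sketch returns the unnormalized sum.

```python
# Illustrative sketch only: function names and sample values are hypothetical.

def matrix_inner_product(a, b):
    """<a, b> over a region: sum of conj(a_k) * b_k, returning a single scalar."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def scalar_inner_product(a, b):
    """Element-by-element product conj(a_k) * b_k, returning one value per pixel."""
    return [x.conjugate() * y for x, y in zip(a, b)]


# Made-up complex electric field samples at three dark hole pixels.
field = [1 + 2j, 0.5 - 0.5j, -1j]

# Scalar intensity over the region (unnormalized sum of |E|^2).
total_intensity = matrix_inner_product(field, field).real

# Intensity distribution, one real value per pixel.
intensity_map = [v.real for v in scalar_inner_product(field, field)]

print(total_intensity, intensity_map)
```

The first quantity collapses the region to one number, in the spirit of IDH; the second keeps the per-pixel distribution, in the spirit of Iim(x, y).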


Chapter 1

Introduction

1.1 Science Motivation

Detecting and characterizing extrasolar planets has become a major point of focus in the as-

trophysical community. Directly imaging solar systems opens up a parameter space unavail-

able to current indirect detection methods such as radial velocity and transit photometry.

These methods are highly successful, but biased towards large planets very close to their

parent star. Even with impressive advances in these detection limits, they are only sensitive

to orbits that intersect our line of sight with the parent star. Apart from astrometry, direct

detection is the only method sensitive to face-on orbits that do not cross our line of sight. It

is also capable of taking reliable spectral measurements of the planet and does not require

an orbital fit to certify the planet’s existence. So long as we can efficiently detect planets

through direct imaging, we can dramatically increase the number of detectable systems.

Much like galactic astronomy, planetary science relies on a large number of observations of

many systems to build up our understanding of the time evolution in solar systems. Such

models that describe the formation of solar systems and their major orbiting bodies need

as much and as detailed a data set as possible. The increased parameter space and the

spectral information that direct imaging has to offer makes it a very compelling method for


this purpose.

Planets are generally classified as jovian or terrestrial bodies, which in our solar system

also correlates to their mass and proximity to the Sun. It is expected that the same mass

correlation will hold, making current detection methods more sensitive to jovian planets. In

fact, over 700 planets have been detected, but the vast majority are between 100 and 10,000

times the mass of Earth [1]. This is largely due to the fact that most detection methods are

more sensitive to higher mass planets with a small orbital period. These detection schemes

do not directly image the orbiting body, but measure the effect of the planet on the stellar

signal in the form of a periodic Doppler shift or drop in intensity. The Kepler mission uses

transit photometry to detect orbiting bodies, and has been highly successful [13, 12]. It is

even capable of detecting Earth analogs (and has already come close) [4], but is incapable

of spectrally characterizing the planet in any way. A major disadvantage of the indirect

detection techniques is that they are fundamentally finding a best fit solution of periodic

data to a Keplerian orbit. This requires observations over at least one orbit to make a

detection, which means these observations are biased to high mass planets very close to

their parent star. Directly imaging a planet does not exhibit the same sensitivity to mass

(though reflected area does play a role in the intensity of the planet) and only requires

enough observations to rule out the possibility of the body being a background star. Direct

imaging also opens a new parameter space of orbits that are observed to be face-on, which are

undetectable by the indirect techniques. Since we are gathering light directly from the planet

we can also directly measure the spectra of a planet, and directly observe the projection of

its separation from the parent star.

To spectrally characterize any detectable planet, we seek imaging methods that are ca-

pable of directly imaging Earth-analogs. The Terrestrial Planet Finder (TPF) telescopes in

the late 1990s to early 2000s were NASA’s original space-based concepts for such a mission.

One was an imaging interferometer with a satellite constellation to produce a long baseline,

commonly referred to as TPF-I [2]. The second was a coronagraphic imager based on a


4 × 8 m elliptical mirror, referred to as TPF-C. These did not become funded missions, but

the science requirements developed for them are still used as a baseline for today’s mission

concepts. More recently, another concept has been developed where the telescope is flown in

formation with a second satellite mounted to a starshade, or occulter, designed to create a

diffraction limited shadow from the star, allowing the off-axis planet to pass by unobstructed

into the telescope [16, 71]. Simultaneously, more advanced coronagraphic imaging concepts

have been developed, making these the two leading concepts for a direct imaging mission

[67, 42, 38, 73, 33, 37, 15]. Each mission has its own set of challenges, but the main objective

of each is to mitigate the diffractive effects of the telescope’s finite aperture. In all likeli-

hood, a combined mission concept will maximize performance with regard to the number of

targets that can be detected and characterized, and will mitigate the risk involved with the mission.

Of the two missions, the coronagraph applies most broadly to both ground and space

instrumentation. This thesis focuses on the technology development for the coronagraph

concept.

1.2 Coronagraphs and Wavefront Control

The two primary obstacles to imaging very dim objects orbiting extremely close to their

parent star are diffraction effects from the telescope and aberrations to the field from im-

perfections in the optical system. The finite aperture of the telescope results in a diffraction

pattern that leaks starlight into the region where a dim companion would otherwise be seen.

As shown in Fig. 1.1, this is not an issue of angular resolution. Neglecting any errors, an

aperture larger than approximately 2 meters is capable of distinguishing two objects sepa-

rated by 1 astronomical unit (AU) within 10 parsecs of the Earth. Fig. 1.1 shows that it

is the relative intensity of the planet and the diffracted light that limits the detectability

of a planet. This typically limits the detectable intensity of a companion to roughly one or

two orders of magnitude dimmer than its parent star. Most generally, a coronagraph can


[Figure 1.1: two panels, “2 meter Telescope” and “10 meter Telescope”; horizontal axis λ/D (−5 to 25), vertical axis Normalized Intensity (10^-15 to 10^0, logarithmic).]

Figure 1.1: Intensity profile of an image from a circular aperture with diameter (a) 2 meters, and (b) 10 meters. The off-axis source with unitary amplitude (red dashed line) indicates that the object is resolvable by either telescope. The off-axis source with a peak intensity 10^-10 lower than the star’s peak intensity (solid green line) shows the relative power of an Earth-like planet and the diffracted energy from the star. If we were to solve the diffraction problem by making the telescope larger, the aperture would have to be greater than 1 km in diameter.

be defined as an optical system contained within the telescope that modifies the diffraction

pattern imposed by the telescope’s finite aperture. By attenuating the diffracted light at

small angular separations, the coronagraph lowers the detection limit of a dim companion.

The degree of suppression is quantified as contrast, a dimensionless parameter in the image

plane. The contrast of any point in the image plane is defined as a fraction of the peak

power in the point spread function (PSF) of the unobstructed aperture. A region of high

contrast is commonly referred to as the search area and the closest point to the star’s centroid

that achieves the targeted contrast level is defined as the inner working angle (IWA) of the

coronagraph. As the designed contrast and IWA decrease, the coronagraph becomes more

difficult to manufacture and simultaneously becomes more sensitive to optical aberrations.
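The distinction drawn above between angular resolution and contrast can be checked with a short calculation. The sketch below is not taken from the thesis; it assumes a visible wavelength of 550 nm, the Rayleigh criterion 1.22 λ/D as the resolution limit, and the ideal Airy pattern of an unobstructed circular aperture. All function and variable names are illustrative.

```python
import math

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0
AU_M = 1.495978707e11          # astronomical unit in meters
PC_M = 3.0856775814913673e16   # parsec in meters


def separation_arcsec(sep_au, dist_pc):
    """Small-angle separation of two objects sep_au apart at distance dist_pc."""
    return (sep_au * AU_M) / (dist_pc * PC_M) * RAD_TO_ARCSEC


def rayleigh_limit_arcsec(wavelength_m, diameter_m):
    """Rayleigh criterion, 1.22 * lambda / D, for a circular aperture."""
    return 1.22 * wavelength_m / diameter_m * RAD_TO_ARCSEC


def bessel_j1(x, n=2000):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def airy_intensity(r_lambda_over_d):
    """Airy pattern of an unobstructed circular aperture, normalized to a peak of 1."""
    if r_lambda_over_d == 0.0:
        return 1.0
    a = math.pi * r_lambda_over_d
    return (2.0 * bessel_j1(a) / a) ** 2


WAVELENGTH_M = 550e-9  # assumed visible wavelength, not taken from the text
planet_sep = separation_arcsec(1.0, 10.0)  # 1 AU seen from 10 parsecs

for diameter_m in (2.0, 10.0):
    limit = rayleigh_limit_arcsec(WAVELENGTH_M, diameter_m)
    lambda_over_d = WAVELENGTH_M / diameter_m * RAD_TO_ARCSEC
    r = planet_sep / lambda_over_d  # planet position in units of lambda/D
    print(f"D = {diameter_m:4.1f} m: limit {limit:.3f} arcsec, "
          f"planet at {r:.2f} lambda/D, "
          f"diffracted starlight {airy_intensity(r):.1e} of peak")
```

Under these assumptions the 2 meter aperture already separates the pair, yet the diffracted starlight at the planet's location remains many orders of magnitude above the 10^-10 level of an Earth-like planet. That gap is what the coronagraph is meant to close.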

Since a coronagraph is fundamentally sensitive to perturbations in the incident field (e.g.

a second slightly off-axis source) it exhibits an extreme sensitivity to optical aberrations that


distort the field, as demonstrated in Fig. 1.4. These can be aberrations that are incident on

the telescope (as is the case when considering atmospheric turbulence) or they can be due to

imperfections in the optical system itself. In either case we seek to correct these wavefront

errors using deformable mirrors (DMs), computer controlled mirrors with high precision

actuators bonded to the back surface. There are many DM technologies, with varying levels

of actuator stroke, density, and precision. In all cases the purpose is to produce achromatic

phase shifts that vary arbitrarily across the plane (limited by the number and spacing of the

actuators).

Adaptive optics (AO) attempt to correct the atmospheric distortion incident on the

telescope. The challenges of atmospheric AO are in the speed of correction (≈ 100’s -

1000’s of Hz), correcting over a large field of view, and the potential for a high degree of

nonlinearity [53, 51]. As shown in Fig. 1.2, a typical AO system samples the beam going

Figure 1.2: Diagram of a generic ground-based atmospheric adaptive optics system. The wavefront sensor and DM are capable of correcting high speed aberrations that appear as phase at the DM (commonly a pupil) plane. Static and quasi-static speckles that the AO system cannot control leave a residual that must be corrected with focal plane wavefront control techniques.

to the science instrument using a dichroic mirror to measure the field at the pupil plane

with some form of wavefront sensing device. For the purpose of this thesis we will consider

the atmospheric AO problem to be solved by an upstream AO system, or by considering a

space-based observatory. To correct imperfections in the optical system, we are trying to

repair the wavefront to a significantly higher degree of precision (since we are conceptually

trying to repair residual static errors). This generally allows us to take approximations that

make the problem linear, but the correction time is much slower. As a result, we require

common-path techniques that account for the aberrations that reach the image plane. This

thesis focuses on model based methods to estimate the electric field at the image plane

using only the science camera, and control laws based on the estimated electric field at the

image plane (rather than pupil plane) measurements. This estimation and control problem

is commonly referred to as focal plane wavefront correction (FPWC).

1.3 The High Contrast Imaging Laboratory

The High Contrast Imaging Laboratory (HCIL) at Princeton tests coronagraphs and wave-

front control algorithms for quasi-static speckle suppression. The collimating optic is a six

inch off-axis parabola (OAP) followed by two first generation Boston Micromachines kilo-

DMs in series and a shaped pupil coronagraph, which is imaged with a second six inch OAP

(Figure 1.3). We use a shaped pupil coronagraph, shown in Figure 1.4(a), and described

in detail in Belikov et al. [5]. This coronagraph produces a discovery space with a theoretical contrast of $3.3 \times 10^{-10}$ in two 90° regions, as shown in Figure 1.4(b). At the Princeton HCIL, the aberrations in the system result in an uncorrected average contrast of approximately $1 \times 10^{-4}$ in the area immediately surrounding the core of the point spread function

(PSF), which agrees with the simulations shown in Figure 1.4(d). Since the coronagraph is

a binary mask, its contrast performance is fundamentally achromatic, subject only to the

physical scaling of the PSF with wavelength. The lab can be configured with either a 635 nm

monochromatic laser diode input, or a Koheras supercontinuum source. As shown in Fig. 1.5,

Figure 1.3: Optical layout of the Princeton HCIL. Collimated light is incident on two DMs in series and then propagates through a shaped pupil; the core of the PSF is removed with an image plane mask, and the 90° search areas are reimaged on the final camera.

before the supercontinuum source is injected into the laboratory experiment, it is first collimated by a 90° off-axis parabolic element designed specifically for collimating and coupling polychromatic light from a fiber. After the light is collimated it passes through a filter wheel, where a set of interference filters allows us to sample narrow bandwidths in a ∆λ/λ0 = 20% range around λ0 = 635 nm. After the light passes through the filter wheel it is recoupled by a second off-axis element into a second fiber, made by Koheras, which is designed to be continuously single mode over the entire visible and near-infrared spectrum. This allows us to reproduce the broadband nature of the light coming from a star, the importance of which is discussed in Ch. 5. Since the collimating/coupling elements rigidly attach the fiber tips to the 90° OAPs, alignment of the beam is determined entirely by tip-tilt variation of the collimated beam. To precisely recouple the light back into the delivery fiber, the collimating element is rigidly mounted to the filter wheel and the coupling element is mounted to a

[Figure 1.4 panels: (a) Shaped Pupil ("Shaped Pupil", axes in mm); (b) Ideal PSF ("Normalized PSF from Ripple3", axes in λ0/D); (c) Aberrated Pupil ("Shaped Pupil After DMs", axes in mm); (d) Aberrated PSF ("Aberrated PSF from Ripple3", axes in λ0/D). Color bars span −10 to 0.]

Figure 1.4: Example of the effect of an aberrated field incident on a Shaped Pupil coronagraph. The aberrations are simulated by Fresnel propagating the measured nominal shapes of the HCIL DMs to the pupil plane. Other sources of aberrations are not included because they have not been measured. (d) The PSF of the shaped pupil with the simulated aberrations. The figures are in a log scale, and the log of contrast is shown in the colorbars.

tip-tilt stage. To eliminate ghosts, all interference filters have a small wedge between their

exterior surfaces. To guarantee a quality alignment for all of the filters for a fixed tip-tilt,

they must all be clocked inside the filter wheel so that when they are positioned within the

beam, the wedge is aligned in the same direction. To guarantee stability of the coupling

(which is sensitive on the sub-micron level) the entire optical train is sealed from the outside

environment, eliminating any air flow through the system. With the system very compact

and light, sealed, and highly rigid (since the tip-tilt mechanism is very stiff) we observe that

the coupling is reliable over a period of weeks to months once it has been aligned. Since

Figure 1.5: Optical layout of the Princeton HCIL's filter wheel. The light from the SuperK supercontinuum fiber is collimated by a Thorlabs reflective collimator, passes through a filter wheel which contains narrow band interference filters, and is recoupled into a Koheras Photonic Crystal continuously Single Mode (PCSM) fiber with another reflective coupler (RC08FC). The system is rigid with the exception of the tip-tilt mechanism, which is used to align the beam for coupling back into the PCSM fiber that delivers light onto the bench.

the original HCIL experiment had proved to be limited by the stability of its old HeNe laser

and its free-space coupling into a fiber, this was a critical design parameter for the filtering

scheme.

The two source configurations allow for testing of control algorithms in both monochromatic and broadband light (typically ∼10−20% of the central wavelength). The monochromatic experiments allow us to test controller performance very quickly, while leaving the

results independent of any chromatic effects. Once a particular algorithm has been proven

in monochromatic light, we can use the polychromatic configuration to test its performance

over a larger bandwidth.

1.4 Two Deformable Mirrors in Series

Focal plane wavefront control techniques have primarily been developed and tested at the

Jet Propulsion Laboratory (JPL) High Contrast Imaging Testbed (HCIT) [24], the Subaru

telescope’s Phase Induced Amplitude Apodization (PIAA) testbed (which has been decon-

structed) [33], more recently at the NASA Ames Coronagraph Experiment (ACE) [6], and

Princeton’s HCIL described in §1.3. JPL’s HCIT is the only experiment in vacuum, and

has tested several coronagraphs using the Electric Field Conjugation (EFC) algorithm. The

primary goal at the HCIT is to test the limit of ultimate achievable contrast and IWA of

each coronagraph and estimation scheme. The Subaru telescope’s PIAA testbed was used

for the initial verification of the PIAA coronagraph [33], which uses a pair of highly aspheric

mirrors as a pupil remapping system to achieve nearly lossless apodization of the pupil for

high contrast imaging at low inner working angle. This experiment has moved to JPL’s

HCIT [34] and progress with the PIAA coronagraph at the Subaru telescope has shifted

to the Subaru Coronagraphic Extreme Adaptive Optics (SCExAO) system [50]. The ACE

experiment focuses on low inner working angle coronagraphy using the PIAA coronagraph,

primarily as a technology demonstrator for critical hardware required in an Explorer class

mission. Princeton’s HCIL is unique compared to these in that we focus on the development

of estimation and control schemes, their efficacy for a true observatory environment, and

their ability to relax coronagraphic tolerance. One of the most unique components of the

HCIL, the two DMs in series, means that both amplitude and phase aberrations are fully

controllable over the entire image plane [55]. The experiments at JPL, Ames, and Subaru

only use one DM. This means they are only capable of correcting phase perturbations on

both sides of the image plane, and energy from amplitude aberrations can only be shifted

from one side of the PSF to the other. As a result they are only capable of reaching high

levels of contrast on a single side of the image plane, as shown in Fig. 1.6. The ability to

correct symmetrically in the image plane allows us to double the discovery space for planets,

but makes the control problem (particularly in broadband light) much more challenging. The

Figure 1.6: Single DM FPWC results from (a) JPL's HCIT [23], (b) Subaru PIAA testbed [33], and (c) NASA's ACE [7]. All three facilities use a single DM, which is why all the results only exhibit a dark hole on a single side of the image plane. As will be shown in Ch. 5, at least two DMs are needed to achieve any amount of amplitude controllability for symmetric dark hole correction.

presence of two DMs at planes that are not conjugate to the pupil plane will be an under-

lying theme to the mathematical development for wavefront estimation and control, as well

as many of the experimental challenges addressed in this thesis. By doubling the discovery

space, two DMs in series increase the likelihood of detecting an exoplanet in any mission,

and adds redundancy in the wavefront control system. As a result, many wavefront control

architectures for planet finding missions assume a 2-DM system [43, 69, 31, 45, 44, 70, 14].

With Princeton’s HCIL being the only laboratory with this capability, the limitations and

results of our work provide a unique and relevant body of information for future coronagraphic instrumentation. Pueyo [59] proved that, to first order, two DMs in series could

correct both amplitude and phase, showed it was achievable over a bandwidth [54], and has

indicated its necessity for coronagraphy on segmented apertures [56]. Kay [40] developed a

DM independent estimation scheme to avoid the compounding effect of optical model error

in a two-DM system and used this to generate symmetric dark holes over the largest pub-

lished areas of the image plane, albeit at more modest contrast levels. We have also begun to

develop algorithms that are capable of creating symmetric dark holes over finite bandwidths

[29]. Overall, these experiments represent the only body of work dedicated to symmetric

dark hole generation and this thesis is a continuation of that effort.

1.5 Fourier Optics

All of the control software and coronagraph designs that appear in this thesis were produced

using Fourier optics. To more fully appreciate their validity and limitations, the relevant inte-

grals are derived here beginning with the Rayleigh-Sommerfeld diffraction integral. Looking

Figure 1.7: Relevant coordinates, vectors, and frames to propagate light from Σ to Σ′.

at Fig. 1.7, we begin by assuming a field on an arbitrary surface, Σ, originating from the

point O. We wish to propagate this field to a plane centered about O′, a distance z away.

Using a vector notation from Kasdin and Paley [36], we define the chief ray from O to O′ as

\begin{equation}
\hat{\mathbf{n}} = \frac{\mathbf{r}_{O'/O}}{\|\mathbf{r}_{O'/O}\|}. \tag{1.5.1}
\end{equation}

The Rayleigh-Sommerfeld integral evaluates the incident electric field on an arbitrary point,

p, in the second plane, Σ′, from every point, q, in the first plane, Σ. We define the vector

from O to q and from O to p as

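\begin{align}
\mathbf{r}_{q/O} &= u_0\mathbf{e}_u + v_0\mathbf{e}_v, \quad \text{and} \tag{1.5.2} \\
\mathbf{r}_{p/O} &= u\,\mathbf{e}_u + v\,\mathbf{e}_v + z\,\mathbf{e}_z. \tag{1.5.3}
\end{align}

Thus, we evaluate the vector from q to p as

\begin{align}
\mathbf{r}_{p/q} &= \mathbf{r}_{p/O} - \mathbf{r}_{q/O} \tag{1.5.4} \\
&= (u-u_0)\mathbf{e}_u + (v-v_0)\mathbf{e}_v + z\,\mathbf{e}_z. \tag{1.5.5}
\end{align}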

With all propagation vectors in place, the Rayleigh-Sommerfeld diffraction integral [25] de-

scribes the field at point p as

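\begin{align}
E(p) &= \frac{1}{i\lambda}\int_{\Sigma} E(q)\,\frac{\cos(\hat{\mathbf{n}}, \mathbf{r}_{p/q})}{\|\mathbf{r}_{p/q}\|}\, e^{i\frac{2\pi}{\lambda}\|\mathbf{r}_{p/q}\|}\, dS \tag{1.5.6} \\
&= \frac{1}{i\lambda}\int_{\Sigma} E(q)\,\frac{\hat{\mathbf{n}}\cdot\hat{\mathbf{r}}_{p/q}}{\|\mathbf{r}_{p/q}\|}\, e^{i\frac{2\pi}{\lambda}\|\mathbf{r}_{p/q}\|}\, dS, \tag{1.5.7}
\end{align}

where

\begin{equation}
\hat{\mathbf{r}}_{p/q} = \frac{\mathbf{r}_{p/q}}{\|\mathbf{r}_{p/q}\|} \tag{1.5.8}
\end{equation}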

is the unit vector pointing from q to p. Even though the result is scalar, the integral is

vector-based and quite complicated to solve. We simplify Eq. 1.5.7 by first applying the

paraxial approximation. This assumes that lateral deviation from the chief ray is so small

that every ray vector has propagated the same distance as the chief ray, making

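\begin{equation}
\|\mathbf{r}_{p/q}\| \approx z. \tag{1.5.9}
\end{equation}

We also assume that for any combination of p and q, $\hat{\mathbf{r}}_{p/q}$ is parallel to $\hat{\mathbf{n}}$, which means

\begin{equation}
\hat{\mathbf{n}}\cdot\hat{\mathbf{r}}_{p/q} \approx 1. \tag{1.5.10}
\end{equation}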

Applying these simplifications to the terms preceding the exponential in Eq. 1.5.7, the

Rayleigh-Sommerfeld diffraction integral simplifies to

\begin{equation}
E(p) = \frac{1}{i\lambda z}\int_{\Sigma} E(q)\, e^{i\frac{2\pi}{\lambda}\|\mathbf{r}_{p/q}\|}\, dS, \tag{1.5.11}
\end{equation}

which is the vector form of the Huygens-Fresnel integral. The application of the paraxial

approximation means that Eq. 1.5.11 is limited to narrow-field imaging. Fortunately in

coronagraphy, we are concerned with very small angular separations.

1.5.1 Propagation: Fresnel Transform

It may at first appear strange that we do not apply the approximation of Eq. 1.5.9 to the

exponential and further simplify Eq. 1.5.11. This is because the 1/λ term is of order $10^6 - 10^7$,

which amplifies small errors due to the approximation and causes rapid 2π periodic errors

in the phase of the integrand. We instead make the Fresnel approximation by taking a first

order expansion of the ||rp/q|| term in the exponential. In scalar coordinates, this becomes

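\begin{align}
\|\mathbf{r}_{p/q}\| &= z\sqrt{1 + \left(\frac{u-u_0}{z}\right)^2 + \left(\frac{v-v_0}{z}\right)^2} \tag{1.5.12} \\
&\approx z\left[1 + \frac{1}{2}\left[\left(\frac{u-u_0}{z}\right)^2 + \left(\frac{v-v_0}{z}\right)^2\right] - \mathcal{O}(2) + \ldots\right]. \tag{1.5.13}
\end{align}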
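As a rough numerical illustration of why the zeroth-order approximation of Eq. 1.5.9 cannot be used inside the exponential while the first-order expansion of Eq. 1.5.13 can, the following Python sketch compares the phase error of each against the exact path length of Eq. 1.5.12. The propagation distance and lateral offset are assumed, illustrative values (of the same order as the laboratory dimensions), not measured quantities.

```python
import numpy as np

# Assumed, illustrative values (not taken from a specific measurement)
wavelength = 635e-9   # m
z = 0.47              # m, assumed propagation distance
rho = 5e-3            # m, assumed lateral offset sqrt((u-u0)^2 + (v-v0)^2)

k = 2 * np.pi / wavelength

r_exact = np.sqrt(z**2 + rho**2)           # exact path length, Eq. 1.5.12
r_zeroth = z                               # zeroth-order approximation, Eq. 1.5.9
r_fresnel = z * (1 + 0.5 * (rho / z)**2)   # first-order expansion, Eq. 1.5.13

# Phase error (radians) each approximation introduces in the exponential
phase_err_zeroth = k * abs(r_exact - r_zeroth)
phase_err_fresnel = k * abs(r_exact - r_fresnel)

print(f"zeroth-order phase error: {phase_err_zeroth:.1f} rad")
print(f"Fresnel phase error:      {phase_err_fresnel:.2e} rad")
```

Under these assumptions the zeroth-order error spans many multiples of 2π, while the Fresnel error stays well below one radian.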

With this approximation we find that the Fresnel Integral is

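\begin{align}
E(u,v,z) &= \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z}\iint_{-\infty}^{+\infty} E(u_0,v_0)\, e^{i\frac{\pi}{\lambda z}\left[(u-u_0)^2+(v-v_0)^2\right]}\, du_0\, dv_0 \tag{1.5.14} \\
&= \mathcal{F}_z\{E(u_0,v_0)\}. \tag{1.5.15}
\end{align}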
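One common way to evaluate the Fresnel integral of Eq. 1.5.14 numerically is to treat it as a convolution and apply it in the frequency domain with FFTs. The sketch below is a minimal illustration of that approach, not the laboratory's control code; the function name `fresnel_propagate`, the grid size, and the sample spacing are assumptions, and no check of sampling adequacy is performed.

```python
import numpy as np

def fresnel_propagate(field, dx, wavelength, z):
    """Fresnel-propagate a sampled 2-D complex field a distance z.

    The Fresnel kernel exp(i*2*pi*z/lambda)/(i*lambda*z) * exp(i*pi*(u^2+v^2)/(lambda*z))
    has the transfer function exp(i*2*pi*z/lambda) * exp(-i*pi*lambda*z*(fx^2+fy^2)),
    so the convolution becomes a multiplication of spectra.
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)
    fy = np.fft.fftfreq(ny, d=dx)
    FX, FY = np.meshgrid(fx, fy)
    H = np.exp(1j * 2 * np.pi * z / wavelength) * np.exp(
        -1j * np.pi * wavelength * z * (FX**2 + FY**2)
    )
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Hypothetical example: a 10 mm circular aperture on a 25.6 mm grid
n = 256
dx = 100e-6          # m per sample (assumed)
wavelength = 635e-9  # m
z = 0.47             # m (assumed propagation distance)

coords = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(coords, coords)
aperture = (np.hypot(X, Y) <= 5e-3).astype(complex)

field_out = fresnel_propagate(aperture, dx, wavelength, z)
power_in = np.sum(np.abs(aperture) ** 2)
power_out = np.sum(np.abs(field_out) ** 2)
print(power_in, power_out)  # the transfer function has unit modulus, so power is preserved
```

The FFT imposes periodic boundary conditions, so in practice the grid would need enough padding that light diffracted off one edge does not wrap around to the other.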

These approximations only hold for a narrow field, and the following equations will not

accurately reflect the image distortion at large angular separation. The validity of the Fresnel

approximation can be determined by evaluating the next highest term in Eq. 1.5.13. The

propagation distance required for this term to contribute << 1 radian in the exponential of

Eq. 1.5.15 is

\begin{equation}
z_{min}^3 \gg \frac{\pi}{4\lambda}\left[(u-u_0)^2 + (v-v_0)^2\right]^2. \tag{1.5.16}
\end{equation}

Goodman [25] points out that this is a conservative estimate and makes arguments for

a softer constraint if the aperture has fine structure and is illuminated by uniform plane

waves. Otherwise it is safer to apply Eq. 1.5.16 to determine the validity of the integral.

In the presence of deformable mirrors and shaped pupil coronagraphs the HCIL does have

sinusoidally varying transmission and phase across virtually any plane, so we must in fact check that Eq. 1.5.16 is satisfied. Thus, for the HCIL's standard 10 mm pupil and 550 nm light (the shortest wavelength used in the laboratory), a Fresnel transform only accurately reflects the diffraction pattern beyond a distance of $z_{min}$ = 38.5 cm. The shortest free space propagation in the laboratory is between the two DMs, a distance of ∼47 cm, which exceeds $z_{min}$ by a factor of 1.2. Since this is a conservative calculation, this margin is likely large enough, particularly since the incident field on DM1 is a uniform plane wave and there are no amplitude variations. The distance from DM2 to the pupil plane, which is the last plane to which the Fresnel integral must be applied, is ∼1.13 m ≈ 3$z_{min}$. Since the amplitude non-uniformity at progressive planes is a direct result of phase to amplitude mixing, larger distances would actually make the non-uniformity in the field worse. Thus it is likely that larger propagation distances for such a small aperture would not solve the problem, and the only solution is to use a higher fidelity integral if more precision is required. To date, the Fresnel integral has not been found to be a limiting factor, but this should be taken as a point of caution since two DMs in series have never demonstrated dark holes below the levels reported in this thesis.
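The distances quoted above can be checked with a few lines of Python. This is a minimal sketch of Eq. 1.5.16 evaluated as an equality; it assumes the worst-case lateral separation equals the full 10 mm pupil width in both u and v, an assumption that reproduces the quoted value of roughly 38.5 cm. The function name `fresnel_z_min` is introduced here for illustration only.

```python
import numpy as np

def fresnel_z_min(wavelength, du_max, dv_max):
    """Distance at which the next term of Eq. 1.5.13 contributes ~1 rad (Eq. 1.5.16)."""
    rho_sq = du_max**2 + dv_max**2
    return (np.pi / (4.0 * wavelength) * rho_sq**2) ** (1.0 / 3.0)

wavelength = 550e-9   # m, shortest wavelength used in the laboratory
pupil = 10e-3         # m, standard HCIL pupil size

z_min = fresnel_z_min(wavelength, pupil, pupil)  # assumed worst case in both axes

ratio_dm = 0.47 / z_min     # DM1-to-DM2 distance relative to z_min
ratio_pupil = 1.13 / z_min  # DM2-to-pupil distance relative to z_min

print(f"z_min = {100 * z_min:.1f} cm")
print(f"DM separation / z_min = {ratio_dm:.2f}")
print(f"DM2-to-pupil / z_min  = {ratio_pupil:.2f}")
```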

1.5.2 Imaging: Fourier Transform

The advantage of Fourier optics is the simplicity with which we can relate the pupil and

image planes. To show this, we will consider a field incident on a lens, Ein, and evaluate

the field one focal length downstream of the optic. We first define a lens operator, which

describes the field exiting the lens as

\begin{equation}
E_{out} = \mathcal{L}\{E_{in}\}. \tag{1.5.17}
\end{equation}

In a true optic with finite thickness (even a mirror), the operator would be nonlinear and

require a continuous integral in z over the entire displacement of the lens surface, or sag.

This complicates computing the diffractive effect of a lens, so we seek an approximation to

simplify the computation. We will first assume that the lens has a parabolic profile. In

ray optics, a parabolic lens has the merit of maintaining an equal path length between the

focal point and collimated plane of the optic, regardless of the ray we choose. This implies

a constant phase change over the entire field at the output, but to see a similar advantage

in diffractive optics we must assume that the optic is infinitely thin. With its effect being

confined to a plane, its parabolic shape only contributes a phase to the incident field. Since

the primary imaging optic is an F/10 parabolic surface with a 6 inch diameter, and we underfill the mirror by a factor of ≈ 15, the sag of the optic is less than a millimeter (< 10% of the relevant aperture). Additionally, we are observing sufficiently on-axis that any effect on the image from the mirror curvature will not be significant, making the infinitely thin, parabolic

phase assumption valid in our system. Defining this parabolic phase to be a function of the

focal length of the optic, f , the lens operator simplifies to

\begin{equation}
\mathcal{L}\{E_{in}(\xi,\eta)\} = E_{in}(\xi,\eta)\, e^{-i\frac{\pi}{\lambda f}(\xi^2+\eta^2)}. \tag{1.5.18}
\end{equation}

The thin lens approximation is increasingly valid for larger F/# = f/D systems because the

sag is small. To propagate the output field from the lens to the image plane using Eq. 1.5.15, we will also assume that the optic is of infinite extent. The most rigorous way to quantify this is to show that nearly 100% of the energy is contained within the optic, which typically requires that the imaging optic be oversized compared to the incident beam by at least a factor of two. Since our optic is oversized by a factor of ≈ 15, we are in fact containing nearly all of the

energy from the pupil. Applying the Fresnel integral (Eq. 1.5.15) to the field L{E1(ξ, η)},

we find the field one focal length downstream of the optic (Fig. 1.8) to be

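\begin{align}
E_{im}(x,y,f) &= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\iint_{-\infty}^{+\infty} \mathcal{L}\{E_1(\xi,\eta)\}\, e^{i\frac{\pi}{\lambda f}\left[(x-\xi)^2+(y-\eta)^2\right]}\, d\xi\, d\eta \tag{1.5.19} \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\iint_{-\infty}^{+\infty} E_1(\xi,\eta)\, e^{-i\frac{\pi}{\lambda f}(\xi^2+\eta^2)}\, e^{i\frac{\pi}{\lambda f}\left[(x-\xi)^2+(y-\eta)^2\right]}\, d\xi\, d\eta \notag \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\, e^{i\frac{\pi}{\lambda f}(x^2+y^2)}\iint_{-\infty}^{+\infty} E_1(\xi,\eta)\, e^{-i\frac{2\pi}{\lambda f}\left[\xi x+\eta y\right]}\, d\xi\, d\eta \notag \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\, e^{i\frac{\pi}{\lambda f}(x^2+y^2)}\, \mathcal{F}\{E_1(\xi,\eta)\}. \tag{1.5.20}
\end{align}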

Neglecting the piston in phase, this shows that the electric field at this final plane is a

quadratic phase factor multiplied by the Fourier transform, F{·}, of the incident field on

an infinitely thin parabolic lens of infinite spatial extent. We define this plane as the image

plane, which corresponds to the focal point of the optic in a ray trace.
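Because the image plane intensity depends only on the magnitude of the Fourier transform in Eq. 1.5.20, a discrete Fourier transform of a sampled pupil is enough to sketch a PSF and express it as contrast (intensity normalized to the PSF peak, as defined earlier in this chapter). The following Python example is a minimal illustration for an unobstructed circular aperture; the grid size and padding factor are assumed values, and the constant and quadratic phase factors are dropped since they do not affect intensity.

```python
import numpy as np

n = 512          # grid size in pixels (assumed)
pupil_px = 64    # pupil diameter in pixels (assumed), giving n / pupil_px pixels per lambda/D

coords = np.arange(n) - n // 2
X, Y = np.meshgrid(coords, coords)
pupil = (np.hypot(X, Y) <= pupil_px / 2).astype(float)

# Centered FFT: shift the pupil center to index 0, transform, shift the result back
image_field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(pupil)))
intensity = np.abs(image_field) ** 2

# Contrast: intensity as a fraction of the PSF peak of the unobstructed aperture
contrast = intensity / intensity.max()

pix_per_lod = n / pupil_px   # pixels per lambda/D in the image plane
center = n // 2
offset = int(round(1.22 * pix_per_lod))  # near the first Airy minimum
print(f"contrast at ~1.22 lambda/D: {contrast[center, center + offset]:.2e}")
```

A shaped pupil mask could be substituted for `pupil` in the same sketch, with the normalization still taken from the unobstructed-aperture peak.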

Now we will consider the special case shown in Fig. 1.8, where we generate the incident

Figure 1.8: The relevant planes for Fourier imaging in astronomical optics are the field E0, at an arbitrary plane a distance z prior to the imaging optic, and the field immediately incident on the plane of the optic, E1. In both cases the image plane electric field, Eim, is located one focal length, f, downstream of the optic.

field on the optic, E1, by Fresnel propagating an arbitrary field, E0, from one focal length

upstream of the lens. In order to describe the image plane field in Eq. 1.5.20 as a function

of E0 we will relate it to the field E1 via the Fresnel integral (Eq. 1.5.15). It is pointed out

by Goodman [25] (and many optics textbooks) that this is simply the convolution

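\begin{equation}
E_1(\xi,\eta) = E_0 * h = \iint_{-\infty}^{+\infty} E_0(u,v)\, h(\xi-u,\eta-v)\, du\, dv, \tag{1.5.21}
\end{equation}

where the kernel of this integral is

\begin{equation}
h(\xi,\eta) = \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\, e^{i\frac{\pi}{\lambda f}\left[\xi^2+\eta^2\right]}. \tag{1.5.22}
\end{equation}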

Recognizing that we ultimately seek the Fourier transform of E1(ξ, η), not E1(ξ, η) itself,

we take the Fourier transform of Eq. 1.5.21 directly. By applying the Fourier convolution

theorem, we find

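\begin{align}
\mathcal{F}\{E_1(\xi,\eta)\} &= \mathcal{F}\{E_0\}\,\mathcal{F}\{h\} \tag{1.5.23} \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\,\mathcal{F}\{E_0\}\iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda f}\left[\xi^2+\eta^2\right]}\, e^{-i\frac{2\pi}{\lambda f}\left[\xi x+\eta y\right]}\, d\xi\, d\eta \notag \\
&= \frac{e^{i\frac{2\pi}{\lambda}f}}{i\lambda f}\,\mathcal{F}\{E_0\}\, e^{-i\frac{\pi}{\lambda f}(x^2+y^2)}\iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda f}\left[(x-\xi)^2+(y-\eta)^2\right]}\, d\xi\, d\eta \notag \\
&= e^{i\frac{2\pi}{\lambda}f}\, e^{-i\frac{\pi}{\lambda f}(x^2+y^2)}\,\mathcal{F}\{E_0\}. \tag{1.5.24}
\end{align}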

Applying Eq. 1.5.24 to Eq. 1.5.20, the image plane field becomes

\begin{equation}
E_{im}(x,y,f) = \frac{e^{i\frac{4\pi}{\lambda}f}}{i\lambda f}\,\mathcal{F}\{E_0(u,v)\}. \tag{1.5.25}
\end{equation}

Thus, apart from a piston phase term, Eq. 1.5.25 shows that the electric field one focal length

downstream of the optic, defined earlier as the image plane, is an exact Fourier transform of

the electric field incident one focal length upstream of the optic. We define this as the pupil

plane. Throughout this thesis the image and pupil are defined as exact Fourier conjugates

of one another. For the purposes of FPWC we will use the image plane as a reference,

requiring that a pupil be exactly one focal length in front of an imaging optic. The constant

phase term in Eq. 1.5.25 is of no consequence since the original field is based off an arbitrary

reference field with zero phase, and is often neglected in many texts. The pupil plane at

the HCIL is 10 mm in diameter, designed to inscribe the kilo-DM aperture. We use a 152.4

mm diameter OAP with a 1.524 meter focal length to form the primary image. The system

is F/152.4 at the primary image, and the overfill factor of the imaging optic is 15.24, more

than adequate to use the infinite optics approximation in the control algorithms.
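The geometric figures quoted for the imaging optic follow from simple arithmetic, sketched below in Python. The sag estimate assumes an on-axis parabola, z = r²/(4f), evaluated at the edge of the full 6 inch optic; the HCIL optic is an off-axis parabola, so this is only an order-of-magnitude check rather than the actual surface departure.

```python
INCH = 25.4e-3                 # m per inch

optic_diameter = 6 * INCH      # m, six inch OAP
focal_length = 1.524           # m
pupil_diameter = 10e-3         # m, HCIL pupil

f_number_optic = focal_length / optic_diameter     # F/# of the optic itself
f_number_system = focal_length / pupil_diameter    # F/# at the primary image
oversize_factor = optic_diameter / pupil_diameter  # optic size relative to the beam

# Assumed on-axis parabola: sag at radius r is r^2 / (4 f)
sag_full_optic = (optic_diameter / 2) ** 2 / (4 * focal_length)

print(f"optic F/#: {f_number_optic:.1f}")
print(f"system F/#: {f_number_system:.1f}")
print(f"oversize factor: {oversize_factor:.2f}")
print(f"sag over full optic: {1e3 * sag_full_optic:.2f} mm "
      f"({100 * sag_full_optic / pupil_diameter:.1f}% of the pupil diameter)")
```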

1.6 Controllability of Amplitude and Phase

Before we start producing control algorithms, we must prove that by placing two DMs in

non-conjugate planes we make both amplitude and phase aberrations controllable, allowing

us to create symmetric dark holes in the image plane. We will do so in two different ways.

First, we follow the work of Pueyo [59] and make an argument based in the pupil plane that

two DMs in series are capable of correcting both amplitude and phase. Since we ultimately

seek controllability at the image plane, we will also prove controllability by directly modeling

the effect on the image plane electric field from an arbitrary perturbation at a non-conjugate

plane. In addition to proving the controllability of both amplitude and phase at the image

plane, the result will also provide a model that we can use in the estimation and control

algorithms of Chs. 2, 3, and 4. In both cases we will treat the modulation of the DM shape

as a perturbation to the electric field at an arbitrary plane, p, upstream of the pupil, as

shown in Fig. 1.9. We account for the fact that the DM is of finite size by including the

aperture function, Ap, but the perturbation induced by the DM is exclusively in phase.

Figure 1.9: Phase perturbations at plane p are propagated to the pupil plane, defined as being one focal length away from the imaging optic. The lens acts as a Fourier transforming device, producing the image plane electric field one focal length after the optic.

1.6.1 Pupil Plane Controllability: Angular Spectrum

Beginning with the controllability proof by Pueyo [59], we decompose the intermediate plane,

p, shown in Fig. 1.9 into a Fourier series. We then Fresnel propagate this field to the pupil

plane and apply the pupil function, Apup, to find the pupil plane electric field, Epup. Since

we are ultimately decomposing the solution to the Fresnel integral into a Fourier series, this

is commonly referred to as the angular spectrum approximation. Given a unitary input field

incident on a DM at plane p that is inducing a phase perturbation, φ, over the aperture, Ap,

the electric field is given by

\begin{equation}
E_p(\xi,\eta) = A_p(\xi,\eta)\, e^{i\phi(\xi,\eta)}. \tag{1.6.1}
\end{equation}

Next we expand the exponential as a Taylor series and take a first order approximation,

assuming that the phase perturbations are small enough that the second order term in the

expansion is negligible. The linearized field is

\begin{equation}
E_p(\xi,\eta) \approx A_p(\xi,\eta)\left[1 + i\phi(\xi,\eta)\right]. \tag{1.6.2}
\end{equation}
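To get a feel for how small the phase must be for this linearization to hold, the short Python sketch below compares the exact exponential with its first-order approximation for a few assumed phase values. The error is bounded by the second-order Taylor term, φ²/2.

```python
import numpy as np

# Assumed, illustrative phase perturbations in radians
phi = np.array([0.01, 0.05, 0.1, 0.3])

exact = np.exp(1j * phi)    # full phase term of Eq. 1.6.1
linear = 1 + 1j * phi       # first-order approximation used in Eq. 1.6.2

error = np.abs(exact - linear)
bound = phi**2 / 2          # magnitude of the neglected second-order term

for p, e, b in zip(phi, error, bound):
    print(f"phi = {p:4.2f} rad: |error| = {e:.2e}, phi^2/2 = {b:.2e}")
```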

We now decompose the phase perturbations at plane p (Fig. 1.9), φ(ξ, η), into a sum of spatial

frequencies. Following notation similar to Pueyo [59], we describe φ as a Fourier series in

Cartesian coordinates. Summing over the integers (n,m) ∈ [−∞,∞] with amplitude bm,n,

the linearized phase perturbation induced by the DM is

\begin{equation}
\phi(\xi,\eta) = \sum_{m,n} b_{m,n}\, e^{i\frac{2\pi}{D}(m\xi+n\eta)}. \tag{1.6.3}
\end{equation}

Applying Eq. 1.6.3 to Eq. 1.6.2, the linearized field at plane p becomes

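\begin{align}
E_p(\xi,\eta) &\approx A_p(\xi,\eta) + iA_p(\xi,\eta)\sum_{m,n} b_{m,n}\, e^{i\frac{2\pi}{D}(m\xi+n\eta)} \tag{1.6.4} \\
&\triangleq E_{nom} + E_{pert}, \tag{1.6.5}
\end{align}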

where we have defined the unperturbed, or nominal, component of the field as Enom and

the perturbed component of the field induced by the DM, Epert. The field due to the DM

perturbation can now be propagated a distance z to a second plane via a Fresnel Trans-

formation. For simplicity in notation we assume that this is the pupil plane, as shown in

Fig. 1.9. Applying Eq. 1.5.15, the effect of the perturbation, Epert, at the pupil plane is

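\begin{align}
E_{pert,pup}(u,v) &= i\sum_{m,n} b_{m,n}\,\mathcal{F}_z\left\{A_p(\xi,\eta)\, e^{i\frac{2\pi}{D}(m\xi+n\eta)}\right\} \tag{1.6.6} \\
&= i\sum_{m,n} b_{m,n}\,\frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z}\iint_{-\infty}^{+\infty} A_p(\xi,\eta)\, e^{i\frac{2\pi}{D}(m\xi+n\eta)}\, e^{i\frac{\pi}{\lambda z}\left[(u-\xi)^2+(v-\eta)^2\right]}\, d\xi\, d\eta. \tag{1.6.7}
\end{align}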

Applying the coordinate transformations,

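\begin{align*}
(u-\xi)^2 &= \left(u-\xi-\frac{m\lambda z}{D}\right)^2 - \left(\frac{m\lambda z}{D}\right)^2 + 2u\left(\frac{m\lambda z}{D}\right) - 2\xi\left(\frac{m\lambda z}{D}\right) \\
(v-\eta)^2 &= \left(v-\eta-\frac{n\lambda z}{D}\right)^2 - \left(\frac{n\lambda z}{D}\right)^2 + 2v\left(\frac{n\lambda z}{D}\right) - 2\eta\left(\frac{n\lambda z}{D}\right),
\end{align*}

the sum of spatial frequencies can be pulled out of the integral, making the field

\begin{align}
E_{pert,pup}(u,v) = i\sum_{m,n}\; & b_{m,n}\, e^{i\frac{2\pi}{D}(mu+nv)}\, e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)} \notag \\
& \cdot \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z}\iint_{-\infty}^{+\infty} A_p(\xi,\eta)\, e^{i\frac{\pi}{\lambda z}\left[\left(u-\xi-\frac{m\lambda z}{D}\right)^2+\left(v-\eta-\frac{n\lambda z}{D}\right)^2\right]}\, d\xi\, d\eta. \tag{1.6.8}
\end{align}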

The result is the angular spectrum approximation. It is the same perturbation expansion

as Eq. 1.6.7, but simplifies the integrand of the Fresnel transform by using shifted output

coordinates,

\begin{align}
u' &= u - \frac{m\lambda z}{D} \tag{1.6.9} \\
v' &= v - \frac{n\lambda z}{D}. \tag{1.6.10}
\end{align}

We indicate this coordinate shift in our operator notation with a subscript, Fz{·}(u′,v′).

Applying this notation, Eq. 1.6.8 is written as


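\begin{equation}
E_{pert,pup}(u,v) = i\sum_{m,n} b_{m,n}\, e^{i\frac{2\pi}{D}(mu+nv)}\, e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)}\,\mathcal{F}_z\{A_p(\xi,\eta)\}_{(u',v')}. \tag{1.6.11}
\end{equation}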
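The shift relationship behind Eq. 1.6.11 can be checked numerically for a single spatial frequency. The Python sketch below is a one-dimensional illustration under assumed, hypothetical parameters: the period D, mode number m, wavelength, and distance are chosen so that the shift mλz/D is a whole number of samples, and a smooth Gaussian stands in for the aperture function to keep FFT wrap-around negligible. It compares a direct Fresnel propagation of the modulated aperture with the shifted, phase-weighted propagation of the aperture alone.

```python
import numpy as np

def fresnel_1d(field, dx, wavelength, z):
    """1-D Fresnel propagation using the FFT transfer-function form."""
    f = np.fft.fftfreq(field.size, d=dx)
    H = np.exp(1j * 2 * np.pi * z / wavelength) * np.exp(-1j * np.pi * wavelength * z * f**2)
    return np.fft.ifft(np.fft.fft(field) * H)

# Hypothetical parameters, chosen so that the shift is an integer number of samples
n = 1024
dx = 25e-6            # m per sample, grid width 25.6 mm
D = 12.8e-3           # m, assumed period of the Fourier series
m = 4                 # cycles per D
wavelength = 640e-9   # m
z = 1.0               # m

xi = (np.arange(n) - n // 2) * dx
A = np.exp(-xi**2 / (2 * (1.5e-3) ** 2))   # smooth stand-in for the aperture A_p

# Left-hand side: propagate the modulated aperture directly (one term of Eq. 1.6.6)
lhs = fresnel_1d(A * np.exp(1j * 2 * np.pi * m * xi / D), dx, wavelength, z)

# Right-hand side: phase factors times the shifted propagation of A alone (Eq. 1.6.11)
shift = m * wavelength * z / D             # m, the coordinate shift of Eq. 1.6.9
shift_px = int(round(shift / dx))
rhs = (
    np.exp(1j * 2 * np.pi * m * xi / D)
    * np.exp(-1j * np.pi * wavelength * z * m**2 / D**2)
    * np.roll(fresnel_1d(A, dx, wavelength, z), shift_px)
)

print("shift (mm):", 1e3 * shift)
print("max |lhs - rhs|:", np.max(np.abs(lhs - rhs)))
```

The two-dimensional case follows the same pattern because the Fresnel kernel is separable in u and v.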

Eq. 1.6.11 shows that the field perturbed at plane p can be computed as a sum of shifted and complex weighted copies of the Fresnel transformed nominal field, Ap. Each term is shifted

to the corresponding spatial frequency in the summation. Examining the value of this shift

for the given laboratory parameters,

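\begin{align*}
\lambda_{max} &\approx 700~\text{nm} \\
n_{max}, m_{max} &= 16~\text{cycles/aperture} \\
z_{max} &\approx 1.6~\text{m} \\
D &= 10.8~\text{mm},
\end{align*}

we see that the maximum shift relevant for wavefront control is

\begin{equation}
\frac{n_{max}\lambda_{max}z_{max}}{D} = 1.65~\text{mm}. \tag{1.6.12}
\end{equation}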
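The shift values discussed here are straightforward to evaluate. The following Python sketch computes the shift for the maximum laboratory parameters listed above and for the nominal control configuration of 10 cycles/aperture at 635 nm; it assumes the same propagation distance and D for the nominal case, which the text does not state explicitly. The helper name `pupil_shift` is introduced for illustration only.

```python
def pupil_shift(cycles, wavelength, z, D):
    """Lateral shift n*lambda*z/D of a spatial-frequency term (Eqs. 1.6.9, 1.6.10)."""
    return cycles * wavelength * z / D

D = 10.8e-3   # m
z_max = 1.6   # m

# Maximum case from the listed laboratory parameters
shift_max = pupil_shift(16, 700e-9, z_max, D)

# Nominal control configuration; z and D assumed unchanged
shift_nominal = pupil_shift(10, 635e-9, z_max, D)

print(f"maximum shift: {1e3 * shift_max:.2f} mm")
print(f"nominal shift: {1e3 * shift_nominal:.2f} mm")
```

With these inputs the maximum shift evaluates to about 1.66 mm, in line with the value of Eq. 1.6.12 to within rounding, and the nominal shift comes out just under 1 mm.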

In our nominal control configuration at 10 cycles/aperture in 635 nm light this value becomes

less than 1 mm. Even so, this is a significant fraction of the 10 mm pupil diameter used

in the laboratory so these shifts should not be neglected if this transformation were to be

used to compute the field for control. Interestingly, the sharp edges of a shaped pupil mean

that we cannot truncate the series to a small set for fear of introducing a Gibbs effect into the numerical model. This precludes the utility of the angular spectrum factor for control, but it

is convenient for rigorously proving that (to first order) two DMs can control both amplitude

and phase aberrations. Recalling from Eq. 1.6.4 that the entire field incident on the pupil is

a sum of the nominal field and the perturbed component, we write the incident field on the

pupil plane as

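\begin{equation}
E_{pup}(u,v) = \mathcal{F}_z\{A_p(\xi,\eta)\} + i\sum_{m,n} b_{m,n}\, e^{i\frac{2\pi}{D}(mu+nv)}\, e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)}\,\mathcal{F}_z\{A_p(\xi,\eta)\}_{(u',v')}. \tag{1.6.13}
\end{equation}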

Applying the pupil function, Apup(u, v), to the incident field we find

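\begin{align}
E_{pup}(u,v) ={} & A_{pup}(u,v)\,\mathcal{F}_z\{A_p(\xi,\eta)\} \notag \\
& + i\sum_{m,n} b_{m,n}\, e^{i\frac{2\pi}{D}(mu+nv)}\, e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)}\, A_{pup}(u,v)\,\mathcal{F}_z\{A_p(\xi,\eta)\}_{(u',v')}. \tag{1.6.14}
\end{align}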

We will make one more simplification to Eq. 1.6.14 by assuming that the aperture function

of the DM, Ap(ξ, η), sufficiently overfills the pupil plane aperture, Apup, such that

\begin{equation}
A_{pup}(u,v)\,\mathcal{F}_z\{A_p\}_{(u,v)} \approx A_{pup}\,\mathcal{F}_z\{A_p\}_{(u',v')} \approx A_{pup}(u,v). \tag{1.6.15}
\end{equation}

This assumption is non-intuitive because we have decomposed the field into spatial frequen-

cies, but it is equivalent to assuming that the Fresnel ringing from the edges of the DM is

negligible because it is blocked by the aperture at the pupil plane, Apup(u, v). Under this

assumption the perturbed field at the pupil plane, Eq. 1.6.14, simplifies to

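\begin{equation}
E_{pup}(u,v) \approx A_{pup}(u,v)\left[1 + i\sum_{m,n} b_{m,n}\, e^{i\frac{2\pi}{D}(mu+nv)}\, e^{-i\frac{\pi\lambda z}{D^2}(m^2+n^2)}\right]. \tag{1.6.16}
\end{equation}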

To prove controllability of amplitude and phase, Pueyo [59] first points out that if the phase

perturbations are small enough we may approximate the aberrated electric field, Eabr, with

arbitrary amplitude aberrations, Aabr, and phase aberrations, φabr, as

\begin{align}
E_{abr}(u,v) &= A_{abr}(u,v)\, e^{i\phi_{abr}(u,v)} \tag{1.6.17} \\
&\approx A_{abr}(u,v)\left(1 + i\phi_{abr}(u,v)\right). \tag{1.6.18}
\end{align}

Thus, in the linear approximation, phase aberrations are purely imaginary and amplitude

aberrations are purely real in the pupil. He then points out that the quadratic exponential

in Eq. 1.6.16 effectively rotates the phase of each term in the series, making the contribution

of each term a mixed complex value rather than a purely imaginary term. By linearizing

this quadratic term and assuming there is a second DM exactly at the pupil plane, Pueyo

[59] used a phasor representation to show that both real and imaginary aberrations can be

perfectly conjugated at the pupil plane if we use a second DM at plane p to perturb the

field. Since each component of the series summation can exactly conjugate an aberration at

the pupil, it follows that these aberrations have been suppressed in the image plane [59].
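The phase rotation described above can be illustrated numerically. The Python sketch below evaluates the rotation angle πλz(m²+n²)/D² from Eq. 1.6.16 for a single spatial frequency and shows how a purely imaginary (phase-like) term acquires a real (amplitude-like) part after propagation. The coefficient `b` is taken as real for simplicity, and the distances of 1.13 m and 1.6 m are used as assumed DM-to-pupil separations based on the values quoted earlier in this chapter.

```python
import numpy as np

def rotation_angle(m, n, wavelength, z, D):
    """Angle of the quadratic exponential in Eq. 1.6.16 for spatial frequency (m, n)."""
    return np.pi * wavelength * z * (m**2 + n**2) / D**2

wavelength = 635e-9   # m
D = 10.8e-3           # m
b = 0.05              # assumed real series coefficient, radians of phase
m, n = 10, 0          # assumed spatial frequency, cycles per aperture

results = {}
for z in (0.0, 1.13, 1.6):   # DM at the pupil, then two assumed upstream distances
    theta = rotation_angle(m, n, wavelength, z, D)
    term = 1j * b * np.exp(-1j * theta)   # one series term of Eq. 1.6.16
    results[z] = (theta, term)
    print(f"z = {z:4.2f} m: theta = {theta:.3f} rad, "
          f"real (amplitude-like) = {term.real:+.4f}, "
          f"imag (phase-like) = {term.imag:+.4f}")
```

At z = 0 the term is purely imaginary, which is the single-DM case; for z > 0 the real part b·sin(θ) is generally nonzero, which is the phase to amplitude mixing that a second, non-conjugate DM exploits.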

1.6.2 Image Plane Controllability: The Propagation Factor

The argument made in Pueyo [59] was the first rigorous proof that DMs at non-conjugate

planes were capable of controlling both amplitude and phase, but it required several approx-

imations and linearizations of field perturbations at both the intermediate and pupil planes.

We would like to understand the effect of a truly arbitrary perturbation from a DM at plane

p on the image plane because this is ultimately where we want to control the field. Doing

so necessitates propagating a non-conjugate plane all the way to the image plane so that we

can compute the control effect, and evaluate how well we can correct a complex-valued field

as a function of location in the image plane.

Since we are also trying to produce a numerical model for control, we begin by assess-

[Figure 1.10 panels: (a) DM1 Nominal Shape ("DM1 Nominal Surface"); (b) DM2 Nominal Shape ("DM2 Nominal Surface"). Axes: x (mm) and y (mm), from −5 to 5. Color bars: surface profile (nm), 0 to 400 for DM1 and −300 to 300 for DM2.]

Figure 1.10: Interferometric measurement of the uncontrolled shape of the HCIL's two kilo-DMs. Note the complex structure of the nominal surface. The amplitude of the high spatial frequency component is ≈ 20−30 nm on each surface. Low order spherical and cylindrical modes have amplitudes on the order of 100's of nanometers.

ing the accuracy of numerically propagating the DM surfaces via the Fresnel integral. As

shown in Fig. 1.10, the nominal surface of each DM is quite complex and contains high

spatial frequency errors with amplitudes of approximately 5% of the wavelength used in the

experiment. Indeed, the high resolution of the DM actuation also implies commands that

oscillate rapidly across the aperture. Capturing this well would require very high sampling

when we solve the Fresnel integral numerically, dramatically increasing our sensitivity to

numerical error and making the computation very slow, which is a problem for real-time control.

We seek to simplify our transformation from plane p to the image plane so that we will be

less sensitive to the discretization of our numerical integrator. This simplification will also

clearly demonstrate how the image plane electric field, Eim(x, y), changes as we move the

DM a distance z from the pupil.

Beginning with Eq. 1.6.1, we Fresnel propagate an arbitrary field at plane p to the pupil

plane. Applying the pupil function Apup, the total field at the pupil plane is

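\begin{align}
E_{pup}(u,v) &= A_{pup}(u,v)\, \mathcal{F}_z\{A_p(\xi,\eta)\, e^{i\phi(\xi,\eta)}\} \tag{1.6.19} \\
&= A_{pup}(u,v)\, \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z} \iint_{-\infty}^{+\infty} A_p(\xi,\eta)\, e^{i\phi(\xi,\eta)}\, e^{i\frac{\pi}{\lambda z}\left[(u-\xi)^2+(v-\eta)^2\right]}\, d\xi\, d\eta \tag{1.6.20} \\
&= \left[A_{pup} A_p e^{i\phi}\right] * h(u,v), \tag{1.6.21}
\end{align}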

where the kernel to the convolution is

\[ h(u,v) = \frac{e^{i\frac{2\pi}{\lambda}z}}{i\lambda z}\, e^{i\frac{\pi}{\lambda z}(u^2+v^2)}. \tag{1.6.22} \]

Using the Fourier transform relationship between the image and pupil plane given by Eq. 1.5.25,

the field at the image plane is

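\begin{align}
E_{im}(x,y) &= \frac{e^{i\frac{4\pi f}{\lambda}}}{i\lambda f}\, \mathcal{F}\left\{\left[A_{pup} A_p e^{i\phi}\right] * h(u,v)\right\} \tag{1.6.23} \\
&= \frac{e^{i\frac{4\pi f}{\lambda}}}{i\lambda f}\, \mathcal{F}\{A_{pup} A_p e^{i\phi}\}\, \mathcal{F}\{h(u,v)\} \tag{1.6.24} \\
&= -\frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{\lambda^2 f z}\, \mathcal{F}\{A_{pup} A_p e^{i\phi}\} \iint_{-\infty}^{+\infty} e^{i\frac{\pi}{\lambda z}(u^2+v^2)}\, e^{-i\frac{2\pi}{\lambda f}[ux+vy]}\, du\, dv \tag{1.6.25} \\
&= -\frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{\lambda^2 f z}\, \mathcal{F}\{A_{pup} A_p e^{i\phi}\} \iint_{-\infty}^{+\infty} e^{-i\frac{\pi z}{\lambda f^2}(x^2+y^2)}\, e^{i\frac{\pi}{\lambda z}\left[\left(u-\frac{z}{f}x\right)^2+\left(v-\frac{z}{f}y\right)^2\right]}\, du\, dv \tag{1.6.26} \\
&= \frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{i\lambda f}\, \mathcal{F}\{A_{pup} A_p e^{i\phi}\}\, e^{-i\frac{\pi z}{\lambda f^2}(x^2+y^2)}. \tag{1.6.27}
\end{align}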

By applying a quadratic phase term at the image plane, Eq. 1.6.27 accounts for the prop-

agation of an arbitrary field from plane p to the pupil plane. We refer to this quadratic term

as the propagation factor. For control, the power of this result lies in the fact that we do not

have to integrate the field twice to compute the field at the image plane from a perturba-

tion at plane p. Note that in most scenarios, particularly with a shaped pupil coronagraph,

we can simplify Eq. 1.6.27 by recognizing that the aperture function will underfill the DM

aperture, making ApupAp ≈ Apup. This is not a critical simplification, but makes the result

slightly easier to understand since the PSF is dominated by the coronagraph.
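As a minimal sketch of how Eq. 1.6.27 might be applied numerically, the snippet below multiplies an already computed Fourier transform of the pupil field by the propagation factor. The function names and all parameter values are assumptions made for illustration; they are not HCIL values and this is not the laboratory control code.

```python
import numpy as np

def propagation_factor(x, y, z, wavelength, focal_length):
    """Quadratic image-plane phase of Eq. 1.6.27 for a DM a distance z from the pupil."""
    theta = np.pi * z / (wavelength * focal_length**2) * (x**2 + y**2)
    return np.exp(-1j * theta)

def image_field_from_dm_plane(ft_pupil_field, x, y, z, wavelength, focal_length):
    """Eq. 1.6.27: one Fourier transform result times a constant and the propagation factor."""
    prefactor = np.exp(1j * 2 * np.pi / wavelength * (2 * focal_length + z)) / (
        1j * wavelength * focal_length)
    return prefactor * ft_pupil_field * propagation_factor(x, y, z, wavelength, focal_length)

# Hypothetical parameters, chosen only for illustration.
wavelength = 635e-9    # m
focal_length = 1.5     # m
z = 0.3                # m, DM-to-pupil distance
half_width = 2e-4      # m, half-width of the sampled image-plane region
x = np.linspace(-half_width, half_width, 101)
X, Y = np.meshgrid(x, x)

P = propagation_factor(X, Y, z, wavelength, focal_length)
P_conjugate = propagation_factor(X, Y, 0.0, wavelength, focal_length)  # DM at the pupil

ft = np.ones_like(X, dtype=complex)   # placeholder for F{A_pup A_p exp(i phi)}
E_im = image_field_from_dm_plane(ft, X, Y, z, wavelength, focal_length)
```

The factor has unit magnitude everywhere, so it only rotates the phase of the transformed field, and it reduces to one when z = 0.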

The propagation factor in Eq. 1.6.27 mixes the effect of the DM's perturbation to the

electric field, φ, between real and imaginary parts in the image plane. One DM at a non-

conjugate plane cannot correct both amplitude and phase aberrations by itself, but rather

corrects a specific mixture of the two dictated by the propagation factor. Having established

one DM a distance z = zp away from the pupil, we will include a second DM at yet another

plane, q, a distance zq away from the pupil. By assuming that the contribution of each DM

at the image plane electric field is additive for sufficiently small stroke, as proven in Pueyo

and Kasdin [54] and Pueyo et al. [55], we aim to show sufficient coverage of the real and

imaginary parts of the electric field in the image plane. It is now important to note that

showing independent and simultaneous control of the real and imaginary parts of the field

is equivalent to demonstrating control over amplitude and phase. The choice of one over

the other is a matter of convenience in representing the field. We will find in the estimation

and control chapters that it is more convenient to represent the image plane in real and

imaginary parts, so for the sake of consistency we will also prove controllability of the image

plane electric field in the same manner.

The choice of the propagation distance for each DM, zp and zq, to the pupil is not critical

except to guarantee enough phase to amplitude mixing once the field reaches the pupil. In a

conventional system, one DM would be conjugate to the pupil and the second non-conjugate

to the pupil. However, this has the disadvantage of requiring additional re-imaging optics to

conjugate one mirror to the pupil. We will show that conjugating one DM to the pupil also

reduces our coverage of the image plane, meaning that we will have poor controllability over

both real and imaginary parts of the field across the entire image plane. To clarify this second

point, we examine the contribution of a DM in Eq. 2.1.5 when it is non-conjugate to the

pupil. Applying Euler’s formula to Eq. 1.6.27, we find the real and imaginary components

of the field to be

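\[ E_{im}(x,y) = -\frac{e^{i\frac{2\pi}{\lambda}(2f+z)}}{\lambda f}\, \mathcal{F}\left\{\left[A_{pup} A_p e^{i\phi}\right]\right\} \cdot \left[ \sin\!\left(\frac{\pi z}{\lambda f^2}(x^2+y^2)\right) + i\cos\!\left(\frac{\pi z}{\lambda f^2}(x^2+y^2)\right) \right]. \tag{1.6.28} \]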

We now see that the contribution of the propagation factor for a single DM oscillates between

being purely real and purely imaginary across the image plane. The consequence is that one

DM non-conjugate to the pupil exhibits regions in the image plane with poor mixing between

real and imaginary components of the field. Using relevant parameters from the HCIL, and

choosing zp = z1 for DM1 and zq = z2 for DM2, Fig. 1.11 shows the overall effect this has in

the image plane by plotting the magnitude of the real term in Eq. 1.6.28, $\left|\sin\!\left(\frac{\pi z}{\lambda f^2}(x^2+y^2)\right)\right|$.

Fig. 1.11(a) and Fig. 1.11(b) show that there is a rapid oscillation in the degree to which

the control effect becomes either purely real or purely imaginary, even at low to mid spatial

frequencies. By choosing z2 such that its propagation factor oscillates with a different period

in the image plane, we have the ability to simultaneously control the real and imaginary

parts of the field over the entire image plane. For example, Fig. 1.11(a) shows that the

magnitude of the real component of the first DM’s propagation factor is very close to zero

at approximately 10λ0/D. Looking at Fig. 1.11(b), we see that the magnitude of the real

term for the second DM’s propagation factor is much larger. The fact that the mixing from

each propagation factor is different means that we can arbitrarily correct either the real or

imaginary part of the field at this location of the image plane. Another way to see this is

to actuate the same spatial frequency, w, on each DM, each with its own amplitude, b1 and

b2. Scaling by the wavelength, λ, the aperture size, D, and the focal length of the imaging

optic, f, we choose DM1's perturbation to be

\[ \phi_1(\xi,\eta) = b_1 \cos\!\left(\frac{2\pi w D\,\xi}{\lambda f}\right), \tag{1.6.29} \]

[Figure 1.11: Effect of the DM propagation on the real and imaginary parts of the electric field at the image plane. Panels: (a) DM1 Propagation Factor, (b) DM2 Propagation Factor, (c) Real Part Combined, (d) Imaginary Part Combined; both axes are in units of λ0/D. (a) and (b) show the magnitude of the real part due to the angular spectrum factor. (c) and (d) overlay the contribution of both DMs to the real and imaginary parts of the field respectively, indicating that there is good coverage of both real and imaginary terms in the search area up to the control limit of the DM. This indicates that the system has a high degree of controllability over both amplitude and phase in monochromatic light.]

where ξ is the physical coordinate in the DM1 plane. We also apply the same shape for

DM2, making its perturbation to the field

\[ \phi_2(\sigma,\tau) = b_2 \cos\!\left(\frac{2\pi w D\,\sigma}{\lambda f}\right), \tag{1.6.30} \]

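where σ is the physical coordinate in the DM2 plane. We now assume that b1 and b2 are

small enough that

\begin{align}
A_1(\xi,\eta)\, e^{i\phi_1(\xi,\eta)} &\approx A_1(\xi,\eta)\,\bigl(1 + i\phi_1(\xi,\eta)\bigr) \tag{1.6.31} \\
A_2(\sigma,\tau)\, e^{i\phi_2(\sigma,\tau)} &\approx A_2(\sigma,\tau)\,\bigl(1 + i\phi_2(\sigma,\tau)\bigr). \tag{1.6.32}
\end{align}

Under this approximation we can use Eq. 1.6.27 to write the perturbation incident on the image plane from DM1 and

DM2 as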

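\begin{align}
E_{pert,DM1} &= \frac{e^{i\frac{2\pi}{\lambda}(2f+z_1)}}{i\lambda f}\, \mathcal{F}\left\{ i A_{pup}\, b_1 \cos\!\left(\frac{2\pi w D\,\xi}{\lambda f}\right) \right\} e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)} \tag{1.6.33} \\
&= \frac{b_1}{2}\, \frac{e^{i\frac{2\pi}{\lambda}(2f+z_1)}}{\lambda f}\, e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)}\, \mathcal{F}\{A_{pup}\} * \bigl(\delta(x-wD) + \delta(x+wD)\bigr) \tag{1.6.34} \\
&= \frac{b_1}{2}\, \frac{e^{i\frac{2\pi}{\lambda}(2f+z_1)}}{\lambda f}\, e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)} \left[ \mathcal{F}\{A_{pup}\}(x-wD) + \mathcal{F}\{A_{pup}\}(x+wD) \right], \tag{1.6.35}
\end{align}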

and

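\begin{align}
E_{pert,DM2} &= \frac{e^{i\frac{2\pi}{\lambda}(2f+z_2)}}{i\lambda f}\, \mathcal{F}\left\{ i A_{pup}\, b_2 \cos\!\left(\frac{2\pi w D\,\sigma}{\lambda f}\right) \right\} e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)} \tag{1.6.36} \\
&= \frac{b_2}{2}\, \frac{e^{i\frac{2\pi}{\lambda}(2f+z_2)}}{\lambda f}\, e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)}\, \mathcal{F}\{A_{pup}\} * \bigl(\delta(x-wD) + \delta(x+wD)\bigr) \tag{1.6.37} \\
&= \frac{b_2}{2}\, \frac{e^{i\frac{2\pi}{\lambda}(2f+z_2)}}{\lambda f}\, e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)} \left[ \mathcal{F}\{A_{pup}\}(x-wD) + \mathcal{F}\{A_{pup}\}(x+wD) \right] \tag{1.6.38}
\end{align}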

respectively. With the same spatial frequency applied to both DMs, we apply two shifted

copies of the PSF at the same locations in the image plane. They will each have an amplitude

chosen by the amplitude of the perturbation applied to the DM, and a phase dictated by

their propagation factor. Since their total contribution to the image plane is

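\begin{align}
E_{pert} &= E_{pert,DM1} + E_{pert,DM2} \tag{1.6.39} \\
&= \frac{e^{i\frac{4\pi f}{\lambda}}}{2\lambda f} \left[ b_1 e^{i\frac{2\pi}{\lambda}z_1} e^{-i\frac{\pi z_1}{\lambda f^2}(x^2+y^2)} + b_2 e^{i\frac{2\pi}{\lambda}z_2} e^{-i\frac{\pi z_2}{\lambda f^2}(x^2+y^2)} \right] \cdot \left[ \mathcal{F}\{A_{pup}\}(x-wD) + \mathcal{F}\{A_{pup}\}(x+wD) \right], \tag{1.6.40}
\end{align}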

we can vary b1 and b2 relative to each other to produce whatever ratio of real and imaginary

parts we like at the image plane location (x, y) = (±wD, 0). If we describe the perturbation

from each DM surface as the sum of all controllable spatial frequencies we can extend this

effect to any point in the image plane. Thus, the coverage provided by complementary

propagation factors gives us sufficient degrees of freedom to simultaneously correct real and

imaginary components of the field. As seen in Eq. 1.6.40, it is this ability that allows us to

create symmetric dark holes in the image plane, as opposed to the single sided correction

shown in Fig. 1.6. To demonstrate the quality of coverage over the entire image plane,

Fig. 1.11(c) and Fig. 1.11(d) overlay the contribution of both DMs to the real and imaginary

parts of the field, respectively. Their relative separations favor coverage for the real part

of the field at small working angles, but both real and imaginary components exhibit good

coverage and never go to zero (note that inside the core of the PSF the value does not

matter). In the example shown, the only nulls are nearly at the 16λ/D controllable limit of

the HCIL DMs (a consequence of the fact that the maximum spatial frequency a DM

can directly command is half of the number of actuators across the aperture).
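The claim that b1 and b2 can be chosen to set the real and imaginary parts independently can be illustrated with a small linear solve on the bracketed term of Eq. 1.6.40. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the normalized values of λ, f, z1, z2 and the image-plane radius are chosen so that the two phasors are easy to read, not taken from the HCIL.

```python
import numpy as np

def dm_phasor(z, r2, wavelength, focal_length):
    """One DM's term in the bracket of Eq. 1.6.40: constant phase times propagation factor.

    r2 is x**2 + y**2 at the image-plane location of interest.
    """
    return np.exp(1j * 2 * np.pi * z / wavelength) * np.exp(
        -1j * np.pi * z * r2 / (wavelength * focal_length**2))

def solve_amplitudes(target, z1, z2, r2, wavelength, focal_length):
    """Real amplitudes b1, b2 such that b1*p1 + b2*p2 equals the complex target."""
    p1 = dm_phasor(z1, r2, wavelength, focal_length)
    p2 = dm_phasor(z2, r2, wavelength, focal_length)
    M = np.array([[p1.real, p2.real],
                  [p1.imag, p2.imag]])
    b1, b2 = np.linalg.solve(M, np.array([target.real, target.imag]))
    return b1, b2

# Normalized, hypothetical units: at r2 = 0 these give p1 = i and p2 = -1.
wavelength, focal_length = 1.0, 1.0
z1, z2, r2 = 0.25, 0.5, 0.0
target = 0.3 - 0.2j
b1, b2 = solve_amplitudes(target, z1, z2, r2, wavelength, focal_length)
```

If the two phasors are parallel in the complex plane the matrix is singular, and a general target cannot be reached with real amplitudes; this is the poor-coverage case described in the text.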

We have now shown that the absolute value of zp and zq matters much less for con-

trollability than their relative magnitudes. Looking to Eq. 1.6.28, if zp/zq were a multiple

of 2π they would have identical propagation factors and we would have poor coverage in

certain areas of the image plane. This effect was also recognized by Shaklan and Green [63],

but in the pupil plane. They point out that a particular spatial frequency of the field is

effectively reconstructed if it is propagated by the Talbot distance, $z_t = 2D^2/(\lambda w^2)$ in our

notation. This is equivalent to the argument made here for controllability in the image plane.

Furthermore, had we chosen the DM at plane q to be conjugate to the pupil by including

re-imaging optics, zq would be zero and the DM at plane q would not be able to compensate

for regions where we have poor control with the non-conjugate DM at plane p.
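The loss of coverage with a pupil-conjugate DM can be sketched using the real-term magnitude of Eq. 1.6.28. In the snippet below, summarizing the overlay of the two DMs by the larger of their two real terms is our own assumption, as are the function names and the normalized parameter values.

```python
import numpy as np

def real_term(z, r2, wavelength, focal_length):
    """|sin(pi z r^2 / (lambda f^2))|, the real-term magnitude of Eq. 1.6.28."""
    return np.abs(np.sin(np.pi * z * r2 / (wavelength * focal_length**2)))

def worst_real_coverage(zp, zq, r2, wavelength, focal_length):
    """Smallest value over r2 of the larger of the two DMs' real terms."""
    best = np.maximum(real_term(zp, r2, wavelength, focal_length),
                      real_term(zq, r2, wavelength, focal_length))
    return best.min()

# Normalized, hypothetical units; r2 is sampled away from the PSF core.
wavelength, focal_length = 1.0, 1.0
r2 = np.linspace(0.5, 1.5, 201)

# DM at plane q conjugate to the pupil (zq = 0): its real term vanishes everywhere.
conjugate = worst_real_coverage(1.0, 0.0, r2, wavelength, focal_length)

# Both DMs non-conjugate with different distances: the nulls no longer coincide.
non_conjugate = worst_real_coverage(1.0, 0.5, r2, wavelength, focal_length)
```

With zq = 0 the combined real term still reaches zero at the null of the first DM, whereas two different non-zero distances keep it well away from zero over the sampled region.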

Having demonstrated good controllability of both the real and imaginary parts of the

image plane electric field over the entire controllable space, we can guarantee good con-

trollability of both amplitude and phase aberrations incident on the system. Additionally,

Eq. 1.6.27 gives us a model for computing the control effect of a DM at a non-conjugate

plane. We will use this propagation factor extensively in lieu of applying multiple numerical

transformations to produce our estimation and control algorithms.

1.7 Numerical Transform

Since the purpose of the DMs is to produce arbitrary surfaces that correct an arbitrary set

of aberrations at the image plane (within the controllable limit of the DM), we cannot rely

on analytical integration to find solutions for the DM shapes. Thus, our control laws require

that we use numerical integration techniques to relate the DM, pupil, and image planes using

the Fourier optics techniques described in §1.5. To that end we will use a matrix formulation

to take the two-dimensional Fourier transform. For the purposes of defining zero at the

center of each plane we will consider the dimension of the pupil and image as a quarter

plane, defined by the number of elements (Npup,Mpup) and (Nim,Mim) respectively. The

dimension of each plane is defined in Table 1.1.

Plane   Coordinate   Dimension
Pupil   u            (2 N_pup) × 1
Pupil   v            (2 M_pup) × 1
Image   x            (2 N_im + 1) × 1
Image   y            (2 M_im + 1) × 1

Table 1.1: Coordinates in each plane

Keeping in mind the dimension of Table 1.1, we discretize the integration along the

coordinates (u, v) in the pupil as $(u_{k_1}, v_{k_2})$, where $(k_1, k_2)$ are integers from $-N_{pup} : N_{pup}$ and

$-M_{pup} : M_{pup}$ respectively. Likewise, we discretize the image plane coordinates, (x, y), as

$(x_{j_1}, y_{j_2})$, where $(j_1, j_2)$ are integers from $-N_{im} : 0 : N_{im}$ and $-M_{im} : 0 : M_{im}$ respectively.

Using this notation the pupil field, A, is discretized with indices $A_{k_1,k_2}$. Using the discretized

pupil and image plane coordinates, we write the Fourier transform in Eq. 1.5.25 as a finite

sum. Following a method similar to [72, 15], we describe the discrete form of the Fourier

transform, denoted by $\hat{\cdot}$, at a particular pixel location in the image plane as

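\[ \hat{A}_{j_1,j_2} = \sum_{k_1=-N_{pup}}^{N_{pup}} \sum_{k_2=-M_{pup}}^{M_{pup}} e^{-i\frac{2\pi}{\lambda f} v_{k_2} y_{j_2}}\, A_{k_1,k_2}\, e^{-i\frac{2\pi}{\lambda f} u_{k_1} x_{j_1}}. \tag{1.7.1} \]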

Using the column vectors defined in Table 1.1, each element of the image $\hat{A}_{j_1,j_2}$ can

be directly encoded into a two-dimensional matrix by writing the elements of the Fourier

integrand as

\[ \hat{A} = e^{-i\frac{2\pi}{\lambda f}(y \cdot v^T)} \cdot A \cdot e^{-i\frac{2\pi}{\lambda f}(u \cdot x^T)} \cdot \frac{du \cdot dv}{\lambda f}, \tag{1.7.2} \]

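where du and dv are the physical dimension of each pixel in the pupil plane, λ is the wavelength

under consideration, and f is the focal length of the imaging optic. Describing an

individual element of A as $a_{k_1,k_2}$, the elements of the resultant matrices are

\begin{align}
e^{-i\frac{2\pi}{\lambda f}(y \cdot v^T)} &= \exp\left( -i\frac{2\pi}{\lambda f}
\begin{bmatrix}
y_{-M_{im}} v_{-M_{pup}} & \cdots & y_{-M_{im}} v_{M_{pup}} \\
\vdots & \ddots & \vdots \\
y_{M_{im}} v_{-M_{pup}} & \cdots & y_{M_{im}} v_{M_{pup}}
\end{bmatrix} \right) \tag{1.7.3} \\
A &=
\begin{bmatrix}
a_{-M_{pup},-N_{pup}} & \cdots & a_{-M_{pup},N_{pup}} \\
\vdots & \ddots & \vdots \\
a_{M_{pup},-N_{pup}} & \cdots & a_{M_{pup},N_{pup}}
\end{bmatrix} \tag{1.7.4} \\
e^{-i\frac{2\pi}{\lambda f}(u \cdot x^T)} &= \exp\left( -i\frac{2\pi}{\lambda f}
\begin{bmatrix}
u_{-N_{pup}} x_{-N_{im}} & \cdots & u_{-N_{pup}} x_{N_{im}} \\
\vdots & \ddots & \vdots \\
u_{N_{pup}} x_{-N_{im}} & \cdots & u_{N_{pup}} x_{N_{im}}
\end{bmatrix} \right) \tag{1.7.5}
\end{align}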

Here 1/(λf) scales the amplitude of the result and du · dv scales the resultant image plane

in physical units. Note that the constant phase factor and imaginary term (which itself is

only a −π/2 phase shift across the plane) have been left out because they are unnecessary for the

control code. The first summation of Eq. 1.7.1 is found for every element along the entire

first index by multiplying the last two matrices of Eq. 1.7.2. We define this intermediate

matrix as

\[ G(j_1,k_2) = A \cdot e^{-i\frac{2\pi}{\lambda f}(u \cdot x^T)} \cdot \frac{du \cdot dv}{\lambda f}. \tag{1.7.6} \]

The rows of $G(j_1,k_2)$ encode the values of the pupil index along the second dimension, $k_2$, and

the columns encode the values of the image plane index along the first dimension, $j_1$. Thus,

we index an individual element of G as $g_{j_1,k_2}$. The final matrix multiplication completes the

summation, making the two matrix multiplications equivalent to evaluating

\begin{align}
g_{j_1,k_2} &= \sum_{k_1=-N_{pup}}^{N_{pup}} A_{k_1,k_2}\, e^{-i\frac{2\pi}{\lambda f} u_{k_1} x_{j_1}} \tag{1.7.7} \\
\hat{A}_{j_1,j_2} &= \sum_{k_2=-M_{pup}}^{M_{pup}} e^{-i\frac{2\pi}{\lambda f} v_{k_2} y_{j_2}}\, g_{j_1,k_2} \tag{1.7.8}
\end{align}

for every discrete location in the image plane, $(j_1, j_2)$. Comparing the two matrix multiplications

in Eq. 1.7.2 to the scalar equations (Eq. 1.7.7, 1.7.8), each element described by $g_{j_1,k_2}$

is encoded in the dimensionality of the matrix in $G(j_1,k_2)$. The final multiplication will

encode the scalar value described by $\hat{A}_{j_1,j_2}$ in each pixel of the matrix, thereby constructing

the two-dimensional Fourier transform. The computational merit of this method is well

described and compared to other numerical methods in [72]. One of the key advantages of

this method over a standard FFT is that the physical dimension of the second plane can

be changed, which is critical to astronomical imaging where the extent of the pupil plane is

orders of magnitude larger than the image plane. The flexibility of the numerical transform

with regard to dimension in Eq. 1.7.2 makes it much simpler to sample the image plane

appropriately.
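The matrix form of Eq. 1.7.2 can be sketched in a few lines of NumPy. This is an illustrative implementation under assumed sampling and optical parameters (all values and the function name are hypothetical), not the laboratory code. The pupil array is stored with rows indexed by v and columns by u, following Eq. 1.7.4.

```python
import numpy as np

def matrix_fourier_transform(A, u, v, x, y, wavelength, focal_length):
    """Evaluate Eq. 1.7.2 with two matrix products.

    A has shape (len(v), len(u)); the result has shape (len(y), len(x)).
    """
    du = u[1] - u[0]
    dv = v[1] - v[0]
    k = -2j * np.pi / (wavelength * focal_length)
    left = np.exp(k * np.outer(y, v))     # Eq. 1.7.3, shape (len(y), len(v))
    right = np.exp(k * np.outer(u, x))    # Eq. 1.7.5, shape (len(u), len(x))
    G = (A @ right) * (du * dv / (wavelength * focal_length))  # Eq. 1.7.6
    return left @ G

# Hypothetical sampling and optics, for illustration only.
wavelength = 635e-9       # m
focal_length = 1.0        # m
n_pup, n_im = 16, 20
D = 10e-3                 # m, aperture width
du = D / (2 * n_pup)
u = (np.arange(-n_pup, n_pup) + 0.5) * du     # even set of pupil samples
v = u.copy()
dx = 0.25 * wavelength * focal_length / D     # four samples per lambda*f/D
x = np.arange(-n_im, n_im + 1) * dx           # odd set of image samples
y = x.copy()
A = np.ones((v.size, u.size))                 # open square aperture

E = matrix_fourier_transform(A, u, v, x, y, wavelength, focal_length)
```

Because x and y are passed in explicitly, the image-plane extent and sampling can be chosen independently of the pupil sampling, which is the flexibility over a standard FFT noted above.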

With regard to choosing the spatial sampling in each plane, there are several factors to

consider. In the image plane, we will seek to normalize to the peak value of the PSF, which is

located at (0, 0). For this reason, the peak value may not be accurately reflected in an even set

of pixels since the exact center is not contained within any one pixel. Thus, an odd number

of pixels is chosen to sample the image plane. In the pupil plane, the choice is the opposite.

In many cases, we seek to exploit symmetry principles to simplify the computation of the

transform as in [15]. Two more practical reasons apply to the computation of the DM shapes

and manufacturing of the pupil. The DMs we use at the Princeton HCIL have an even set of

actuators across the surface which means that the functions it can generate are fundamentally

based on an even set across the aperture. Thus we can more accurately represent their surface

with an even data set. More importantly, real world manufacturing processes define shapes

based on the physical coordinates of a boundary, the precision of which is embedded in

the numerical precision of the coordinates. There is a subtle difference between defining

the aperture of the coronagraph as a set of boundaries (as in the manufacturing process)

and defining the aperture as a two-dimensional set of values (as the design process using

numerical transforms). Defining an odd set in the manner shown in Fig. 1.12(a) artificially

oversizes the pupil by a half a pixel in each direction. This can be repaired, but at the

cost of shifting the definition of the center within a particular discrete value. No matter

how you define the odd set, the edge and the center cannot share a common reference. If

one is edge defined, the other is guaranteed to be center defined. To solve this, an even

set of pixels is chosen, but Fig. 1.12(b) demonstrates that this can also be done incorrectly.

In this case the definition of the physical coordinates has resulted in a half pixel shift of

the pupil, which will artificially impose a tip-tilt discrepancy in the phase computed for

the image plane electric field. Fig. 1.12(c) demonstrates the proper definition of physical

coordinates for an even discretization in the pupil plane. This set produces pixels whose

physical coordinates are edge defined. Unlike the odd case both the center and edges of the

pupil are edge defined, dramatically simplifying the physical mapping of the pixel values to

the boundary coordinates required to manufacture the pupil.

In summary, the appropriate method for discretizing these planes is to create an edge-

defined even set for the pupil (or any intermediary plane) and a center-defined odd set in the

final image plane. The matrix representation of the numerical Fourier transform shown in

this section is highly reliable and efficient while maintaining a degree of flexibility necessary

to dimension each plane appropriately for astronomical imaging.
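One way to build the two coordinate sets described in this summary is sketched below. The helper names are hypothetical, and the half-pixel offset is our reading of the edge-defined even set in Fig. 1.12(c): pixel edges, rather than pixel centers, fall on the aperture center and boundary.

```python
import numpy as np

def pupil_grid(n_half, width):
    """Even, edge-defined set: 2*n_half pixel centers spanning an aperture of the given width.

    Pixel edges fall exactly on -width/2, 0, and +width/2.
    """
    d = width / (2 * n_half)
    return (np.arange(-n_half, n_half) + 0.5) * d

def image_grid(n_half, pixel):
    """Odd, center-defined set: 2*n_half + 1 samples with one pixel centered on 0."""
    return np.arange(-n_half, n_half + 1) * pixel

u = pupil_grid(8, 10e-3)    # hypothetical 10 mm aperture, 16 samples
x = image_grid(5, 1e-6)     # hypothetical 1 micron pixels, 11 samples
```

The pupil samples are symmetric about zero without containing it, so no half-pixel shift (and hence no artificial tip-tilt) is introduced, while the image grid contains the exact center at which the PSF peak is evaluated.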

[Figure 1.12: Panels: (a) Odd Set: Oversized Pupil, (b) Bad Even Set: Shifted Pupil, (c) Good Even Set: Edge Defined. (a) and (b) represent poor choices for the numerical dimension of a pupil plane, whereas (c) demonstrates the ideal way to define an aperture for producing a coronagraph.]

1.8 Thesis Overview

In this thesis we will use the tools developed in this chapter to develop FPWC algorithms

to correct aberrations from the system optics that degrade the high contrast regions of our

coronagraphic image. By doing so, we will be able to recover small regions of high contrast

where we may once again search for planets. Ch. 2 develops the control laws and strategies

that suppress the aberrated field in both monochromatic and broadband light. To reduce

the number of exposures required for wavefront estimation we also develop an extrapolation

technique that uses a single monochromatic estimate in the broadband wavefront control

algorithm. In Ch. 3 we introduce the DM diversity estimation algorithm, the most common

estimation scheme used for FPWC at any laboratory. It is a batch process estimator that

computes the field via a left pseudo-inverse, which provides least squares minimal error on

the field. The measurements are differential measurements at the image plane with conjugate

DM shapes. We also address how to choose the probe shapes so that the problem is well

posed and we guarantee measurements with high signal-to-noise. In Ch. 4 we develop a new

estimation scheme to replace the DM diversity estimator, utilizing a Kalman filter. The

filter still uses the same probe scheme, but uses fewer measurements since we are able to

close the loop on the state estimate. We demonstrate how this allows us to operate more

efficiently and robustly, since we can rely on measurements taken when the aberrated field

was brighter. Finally we discuss a method by which we can use the control signal itself

to probe the field, enabling closed loop estimation using only a single measurement each

iteration. Ch. 5 and Ch. 6 show the experimental results using these estimation and control

schemes, and then discuss the modeling and experimental limitations that restrict our ability

to suppress the field. In Ch. 7 we make our final conclusions and point to possible future

directions of this work, particularly in the area of electric field estimation.

As might be surmised from this overview, the challenge of this problem is not just the optics;

the level of precision we must adhere to makes this an inherently challenging estimation

and control problem. Beyond that, our fundamental measurement of error, the accuracy and

precision with which we can manipulate the electric field, is completely unobservable. As

such, our criterion for estimator and controller performance is not simply the

ultimate achievable contrast, but the efficiency and robustness of the correction algorithm

as well. At any point, this could be limited by the experiment, our model of the exper-

iment from which we derive the estimation and control laws, or the correction algorithm

itself. Improving our efficiency and robustness drives many of the choices we make in the

mathematical development in this thesis because without improvements in all three of these

categories a wavefront control system will never work in a real space observatory.

1.9 Chapter Assumptions

The following assumptions were made in the derivations of this chapter.

§1.5 - From Rayleigh-Sommerfeld to Huygens-Fresnel:

• The paraxial approximation; the magnitude of any propagation vector is equal to that

of the chief ray.

• The on-axis approximation, which assumes that the dot product of any ray with the

chief ray is equal to one.

§1.5.1 - From Huygens-Fresnel to the Fresnel Integral:

• A first order binomial expansion of the propagation distance in the exponential, the

validity of which is determined by Eq. 1.5.16.

§1.5.2 - From Fresnel to Fourier Transform Imaging:

• Infinitely thin optics of infinite extent (or large enough to effectively capture all of the

energy)

• The optic is modeled as only applying a quadratic phase to the field incident on the

optic

§1.6.1 - Pupil Plane Controllability: Angular Spectrum:

• The phase perturbations at the pth plane upstream of the pupil are small enough that

a first order linearization may be taken in the exponential.

• The pupil aperture is undersized enough compared to the aperture at the pth plane

that its Fresnel propagation (and the shifted Fresnel integral) have no effect on the

field at the pupil plane. In principle this need only be undersized enough to cover the

Fresnel ringing induced by the propagation of the aperture at the pth plane.

§1.6.2 - Image Plane Controllability: The Propagation Factor:

• The contribution of each DM perturbation is additive in the image plane.

• For the analytical proof of controllability using a single spatial frequency, we assume

that the control amplitudes are small enough that a first order linearization of each

DM plane is valid and that the DM can be described as a superposition of spatial

frequencies. This does not rule out controllability for larger stroke; in that scenario an

argument was made based on coverage of phase mixing.

• A continuous set of spatial frequencies are controllable up to the controllable limit of

the DM.

Chapter 2

Focal Plane Wavefront Control

The goal of any wavefront correction algorithm in high contrast imaging is to reduce the

intensity of the aberrated field to a level that makes a planet detectable. Quantifying a

detection limit is in itself complicated because in addition to the residual aberrated field

we must account for photon and detector noise, background from the exozodiacal light,

and integration time. For simplicity in the control algorithms and in quantifying their

performance we simply quote directly measured values of the normalized intensity relative

to the peak power of the PSF.

In contrast to conventional adaptive optics systems where a non-common path wavefront

sensor is used to control large perturbations in the field in a fast feedback loop, focal plane

wavefront control (FPWC) does not include a wavefront sensor. The non-common path

of the wavefront sensor in conventional AO makes it impossible to accurately measure the

electric field at the science camera to the required level of precision (better than a part in $10^5$

for an Earth-like planet). Extreme-AO systems currently under development will attempt to

calibrate the non-common path to reach contrast levels of $10^{-6}$ to $10^{-7}$, but this has yet to be

proven [48, 8]. In order to eliminate all non-common path elements the field at the science

camera must be measured directly. Since the science detector can only measure intensity the

field must be estimated, which is the topic of Chs. 3 and 4. We also seek a control law for the

DMs, which are upstream of the pupil (Fig. 1.3), based on the estimated electric field at the

image plane. To do so requires that we map the effect of the DMs on the image plane electric

field, which will rely on the propagation equations derived in Ch. 1. As a result the control

laws derived in this chapter are model based, requiring that we have a good measurement of

critical physical parameters such as aperture sizes, focal lengths, and propagation distances

between critical planes. Most critical is the DM model, which describes the mirror shape as

a function of the applied voltages.

2.1 Monochromatic Wavefront Control

There are a variety of control laws we may choose for this problem. Energy Minimization is

one of the first attempts at a model-based control law; it computes actuator commands that

minimize the intensity in a region of the image plane [49, 11]. This controller proved to be

numerically unstable because the field at the image plane must be inverted to compute the

DM commands. Since this matrix is driven towards zero it comes close to being singular,

making the computed control highly inaccurate. Electric field conjugation regularizes the

inversion by driving the field to a targeted value (typically the theoretical PSF) rather than

attempting to drive the field to zero [23, 24]. This guarantees inversion, but is highly sensitive

to the linearity of the control since it does not regulate the actuator strength. There is also

no guarantee that the targeted field is reachable. Rather than using feedback control in this

manner, we will seek to optimize the control effect. Instead of using our

control signal from the DM to minimize the average contrast in our dark hole, $I_{DH}$, we seek

a solution that minimizes the deformation across the DM's surface under the constraint that

it achieves our targeted contrast level, $10^{-C}$. The algorithm developed by Pueyo et al. [55]

achieves this by minimizing the sum of the squares of all the actuator strengths subject to

the constraint that it achieve a specified average contrast in the area where we seek to create a

dark hole. To compute the effect of a DM's control signal and the incident aberrations on the

average contrast value in the image plane, we must propagate the resulting pupil plane field

to the image plane via Fourier and Fresnel transformations developed in Ch. 1. Assuming

an aberrated field being added to the nominal field incident on the pupil function, A(u, v),

the pupil plane electric field is given by

\[ E_{pup}(u,v) = A(u,v)\,\bigl(1 + g(u,v)\bigr)\, e^{i\phi(u,v)}, \tag{2.1.1} \]

where g(u, v) is the complex aberrated electric field and φ(u, v) is the total phase perturbation

induced by the DM. The DM perturbation will be added to a pre-existing (also referred to as

“nominal”) DM shape, φ0(u, v), about which we will ultimately linearize the phase induced

by the DM. Recall that by applying the propagation factor, Eq. 1.6.27, we can account

for the fact that the DMs are not conjugate to the pupil after we have computed IDH .

Assuming that the phase perturbation from the DM, φ, is adequately small we take a first

order approximation of the exponential. Linearizing the pupil plane electric field about the

phase induced by the nominal DM surface, φ0, we find

\[ E_{pup}(u,v) \cong A(u,v)\,\bigl(1 + g(u,v)\bigr)\, e^{i\phi_0}\,\bigl(1 + i(\phi(u,v) - \phi_0)\bigr). \tag{2.1.2} \]

Additionally, we assume that the product g(φ−φ0) is negligible compared to the other terms

since they will both be small and of approximately the same magnitude. Thus

\[ E_{pup}(u,v) \cong A(u,v)\, e^{i\phi_0}\,\bigl(1 + g(u,v) + i(\phi(u,v) - \phi_0(u,v))\bigr). \tag{2.1.3} \]

If we assume that we are starting from φ0 = 0, we eliminate the exponential making

Epup(u, v) = A(u, v)(1 + g(u, v) + iφ(u, v)). (2.1.4)

Applying Eq. 1.5.25, the linearized form of the image plane electric field about φ0(u, v) = 0

is

Eim(x, y) = F{A(u, v)}+ F{A(u, v)g(u, v)}+ iF{A(u, v)φ(u, v)}. (2.1.5)
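The size of the terms dropped between Eq. 2.1.1 and Eq. 2.1.4 can be checked numerically. The sketch below uses a one-dimensional stand-in for the pupil with randomly drawn small aberrations; the function names, the amplitudes of order 1e-3, and the random draws are assumptions chosen only to illustrate the scaling.

```python
import numpy as np

def exact_pupil_field(A, g, phi):
    """Eq. 2.1.1 evaluated directly."""
    return A * (1 + g) * np.exp(1j * phi)

def linearized_pupil_field(A, g, phi):
    """Eq. 2.1.4: first order in g and phi, linearized about phi0 = 0."""
    return A * (1 + g + 1j * phi)

rng = np.random.default_rng(0)
n = 64
A = np.ones(n)                                                      # open aperture
g = 1e-3 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))   # complex aberration
phi = 1e-3 * rng.standard_normal(n)                                 # DM phase, radians

residual = np.max(np.abs(exact_pupil_field(A, g, phi)
                         - linearized_pupil_field(A, g, phi)))
```

The residual is set by the neglected product of g and φ and by the quadratic term in φ, so it is second order in the small quantities while the retained terms are first order.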

The linearization has simplified our representation of the image plane electric field by

making the effect of each component additive. Potentially the most useful outcome of this

is the flexibility with which we can account for a non-conjugate DM. In this event, we may

generalize the Fourier transform of φ in Eq. 2.1.5 as a linear operator, C{·}, that includes

the propagation of the DM by applying the propagation factor in Eq. 1.6.27. Knowing this

field, we now compute the image plane intensity to be

\[ I_{im}(x,y) = \bigl| C\{A(u,v)\} + C\{A(u,v)g(u,v)\} + iC\{A(u,v)\phi(u,v)\} \bigr|^2. \tag{2.1.6} \]

We then integrate the image plane intensity over the dark hole to find

\[ I_{DH} = \iint_{DH} \bigl| C\{A\} + C\{Ag\} + iC\{A\phi\} \bigr|^2\, dx\, dy. \tag{2.1.7} \]

With the scalar intensity, $I_{DH}$, in place we seek to describe this in a way that will be useful

for control. We first discretize the integral in Eq. 2.1.7 as

\[ I_{DH} = \sum_{i,j} \bigl| C\{A\}_{(i,j)} + C\{Ag\}_{(i,j)} + iC\{A\phi\}_{(i,j)} \bigr|^2\, \Delta x\, \Delta y, \tag{2.1.8} \]

where the pair (i, j) describes a discrete point in the two-dimensional plane. The physical

dimension of each point, (∆x,∆y), matches the pixel size of our detector. We now rewrite

the quadratic sum Eq. 2.1.8 as an inner product:

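\[
\begin{aligned}
I_{DH} ={}& \langle \mathcal{C}\{A\}, \mathcal{C}\{A\} \rangle + \langle \mathcal{C}\{Ag\}, \mathcal{C}\{Ag\} \rangle + \langle i\,\mathcal{C}\{A\phi\}, i\,\mathcal{C}\{A\phi\} \rangle \\
&+ 2\Re\{\langle \mathcal{C}\{A\} + \mathcal{C}\{Ag\}, i\,\mathcal{C}\{A\phi\} \rangle\} + 2\Re\{\langle \mathcal{C}\{A\}, \mathcal{C}\{Ag\} \rangle\}.
\end{aligned}
\tag{2.1.9}
\]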

Eq. 2.1.9 now gives us the ability to consider the contribution of each component to the

intensity. The middle three terms will fully describe the interaction of the DM actuation

with the aberrated field. Conventional wisdom says that by design the coronagraph will

achieve the required contrast levels when there are no aberrations. Thus by design, C{A}

should be negligible compared to the aberrated field and the DM actuation. Consequently,

the effect of < C{A}, C{A} >, < C{A}, C{Ag} >, and < C{A}, iC{Aφ} > on the value of

IDH are negligible compared to the other three terms in Eq. 2.1.9. Here we must note that

efforts are being made to improve coronagraph performance (e.g. IWA and throughput) by

relaxing the contrast level of the nominal PSF to be equal with the amplitude aberrations

present in the system [39]. In this case, the DMs are being used to reach contrast levels

below the PSF (being equivalent to pupil mapping [57, 58]) which requires that this term

be accounted for in the controller. As will be shown in Ch. 3, the estimator includes the

nominal PSF in the state estimate of the image plane electric field. If we compute the scalar

contrast value from a single image this will also include the contribution of the nominal PSF

in the measurement. Thus, we do in fact account for the contribution of the nominal field

in the cost function and the controller will be capable of suppressing beyond the nominal

value of the PSF. Reordering Eq. 2.1.9 so that each term reflects a measurable quantity in

the laboratory, the intensity to be used for our cost function is

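\[
\begin{aligned}
I_{DH} ={}& \langle \mathcal{C}\{A\phi\}, \mathcal{C}\{A\phi\} \rangle + 2\Re\{\langle \mathcal{C}\{A(1+g)\}, i\,\mathcal{C}\{A\phi\} \rangle\} \\
&+ \langle \mathcal{C}\{A(1+g)\}, \mathcal{C}\{A(1+g)\} \rangle.
\end{aligned}
\tag{2.1.10}
\]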

Now we impose a matrix inner product so that we can describe the electric field and the

DM control effect, our state and control variables, as column matrices. These are formed

by stacking the columns of the two-dimensional fields to create a single column matrix, an

approach used throughout this thesis. The state in our control algorithm is the column

matrix describing the aberrated field at the image plane, C{Ag}. However, we have yet to

parameterize the DM commands into a control matrix, u. At the moment we could solve

for C{Aφ} but what we seek are actuation commands for the DM, not its field at the image

plane. One control approach might be to solve for C{Aφ}, compute the inverse transform,

and use an arbitrary physical model of the DM to compute the actuator strengths (which

relates the surface height to the voltage commands). This requires real time transformations

during the control loop, which will make the algorithm slower and more reliant on a lot of

computational power. Instead, we will introduce a physical model of the DM so that the

optimization can directly solve for the amplitude of each actuator on the DM. This will put

less demand on the computer, and each control step will be faster. Letting H(x, y) be the

height of the DM surface, the resulting phase perturbation induced by the DM, φ(u, v), is

\[
\phi(u,v) = \frac{2\pi}{\lambda_0} H(x,y). \tag{2.1.11}
\]

For control, we wish to describe DM surface height, H(x, y), as a combination of the two-

dimensional height maps imposed by each actuator. Since we are using a DM with a con-

tinuous face sheet, the contribution of any actuator will be highly localized but will still

deform the entire DM surface. As a result, we must describe the contribution of the qth

actuator as a two-dimensional phase map over the entire plane of the DM surface, hq(x, y).

The continuous membrane means that the combination of all actuators is nonlinear, and very

complicated to compute [9]. However, we will show later that we operate in an extremely low

stroke regime. We will show in Ch. 6 that even with actuation levels nearly 4 times larger

than our peak-to-valley actuator commands, the combination of actuators is close to linear

(Fig. 6.2). Thus, we can describe H(x, y) as a superposition of hq(x, y) over all actuators,

Nact. The phase contribution of the DM can then be described as

\[
\phi(u,v) = \frac{2\pi}{\lambda_0} \sum_{q=1}^{N_{act}} h_q(u,v). \tag{2.1.12}
\]

Finally, we wish to make our control matrix, u, a column matrix made up of the control

signal from each actuator, uq. To do so we describe hq(u, v) as a characteristic shape with

unitary amplitude, commonly referred to as an influence function, fq(u, v). To find hq(u, v)

we simply multiply fq(u, v) by the control amplitude, aq. Describing hq(u, v) with influence

functions, the phase perturbation induced by the DM is

\[
\phi(u,v) = \frac{2\pi}{\lambda_0} \sum_{q=1}^{N_{act}} a_q f_q(u,v), \tag{2.1.13}
\]

which sums the qth 2-D phase map, or influence function fq(u, v), for all Nact actuators to

reconstruct φ(u, v). The strength of each influence function is determined by aq.
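
The superposition in Eq. 2.1.13 can be sketched in Python as a matrix-vector product, with each influence function stacked into one column. The Gaussian influence shape, the grid and actuator counts, the wavelength, and the function names below are assumptions made for illustration only; they are not the measured influence functions of the DMs.

```python
import numpy as np

def gaussian_influence_functions(n_grid, n_act_side, width):
    """Return an (n_grid**2, n_act_side**2) matrix whose q-th column is an
    influence function f_q, stacked column-wise. The Gaussian shape is an
    assumption used purely for illustration."""
    coords = np.linspace(-1.0, 1.0, n_grid)
    uu, vv = np.meshgrid(coords, coords, indexing="ij")
    centers = np.linspace(-1.0, 1.0, n_act_side)
    columns = []
    for cu in centers:
        for cv in centers:
            f_q = np.exp(-((uu - cu) ** 2 + (vv - cv) ** 2) / (2.0 * width ** 2))
            columns.append(f_q.ravel(order="F"))  # stack the columns of the 2-D map
    return np.stack(columns, axis=1)

def dm_phase(f, a, wavelength):
    """Eq. 2.1.13: phi = (2 pi / lambda0) * sum_q a_q f_q, as a stacked column."""
    return (2.0 * np.pi / wavelength) * (f @ a)

n_grid, n_act_side = 32, 4          # hypothetical sizes
wavelength = 633e-9                 # hypothetical central wavelength (m)
f = gaussian_influence_functions(n_grid, n_act_side, width=0.3)

rng = np.random.default_rng(1)
a = 1e-9 * rng.standard_normal(n_act_side ** 2)   # actuator amplitudes a_q (m)
phi = dm_phase(f, a, wavelength)
phi_map = phi.reshape((n_grid, n_grid), order="F")  # back to a 2-D phase map
```

Writing the sum as f @ a is what later allows the DM contribution to be expressed through a single matrix acting on the command vector.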

Substituting Eq. 2.1.13 into Eq. 2.1.10, we write C{Aφ} = C{Af}u, where u is a column

matrix of actuator strengths, u = [a1 . . . ak]T , and f is a matrix describing the perturbation

of each influence function, fq, at the pupil. In this formulation, we are specifically applying

a matrix inner product and C{Af} can be written as a matrix of dimension Npix×Nact. To

simplify the notation we define this matrix as

G = C{Af} ∈ [Npix ×Nact]. (2.1.14)

This allows us to write

< C{Aφ}, C{Aφ} >= uTG∗Gu. (2.1.15)

Applying the matrix form of the control amplitudes to Eq. 2.1.10 we find

\[
I_{DH}(\lambda_0) = \frac{4\pi^2}{\lambda_0^2}\, u^T M_0 u + \frac{4\pi}{\lambda_0}\, u^T \Im\{b_0\} + d_0. \tag{2.1.16}
\]

Where

M0 =< C{Af}, C{Af} >= G∗G (2.1.17)

b0 =< C{A(1 + g)}, C{Af} >= G∗C{A(1 + g)} (2.1.18)

d0 =< C{A(1 + g)}, C{A(1 + g)} >= C{A(1 + g)}∗C{A(1 + g)}. (2.1.19)
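
As a consistency check on Eq. 2.1.16 through Eq. 2.1.19, the Python sketch below evaluates the quadratic form and compares it with the intensity computed directly from the linearized field, C{A(1 + g)} + i(2π/λ0)Gu. The matrix G and the aberrated field are random stand-ins rather than outputs of an optical model, the pixel area ∆x∆y is absorbed into the normalization, and all sizes and the wavelength are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pix, n_act = 200, 12               # hypothetical sizes
wavelength = 633e-9                  # hypothetical central wavelength (m)
k = 2.0 * np.pi / wavelength

# Random stand-ins for G = C{A f} and for the aberrated field C{A(1 + g)}.
G = rng.standard_normal((n_pix, n_act)) + 1j * rng.standard_normal((n_pix, n_act))
E_ab = 1e-3 * (rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix))
u = 1e-9 * rng.standard_normal(n_act)    # actuator amplitudes a_q (m)

M0 = G.conj().T @ G                  # Eq. 2.1.17
b0 = G.conj().T @ E_ab               # Eq. 2.1.18
d0 = np.vdot(E_ab, E_ab).real        # Eq. 2.1.19

# Eq. 2.1.16; u is real, so only the real part of M0 contributes.
I_quadratic = k**2 * u @ M0.real @ u + 2.0 * k * u @ b0.imag + d0

# Direct evaluation from the linearized image plane field.
E_im = E_ab + 1j * k * (G @ u)
I_direct = np.sum(np.abs(E_im) ** 2)
```

The two values agree to rounding error, which confirms the sign of the cross term: with b0 = G*C{A(1 + g)}, the interaction enters through the imaginary part of b0.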

Conceptually, d0 is the scalar intensity contribution from the aberrated field,

b0 is a matrix representing the interaction of the DM electric field with the aberrated field,

and M0 describes the additive contribution of the DM to the image plane intensity. Having

represented IDH in a quadratic form with regard to a control matrix, we can use Eq. 2.1.16

to produce an optimal control strategy. Recalling that the targeted contrast is 10−C , the

optimization problem in monochromatic light is stated as

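\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to} \quad & I_{DH}(\lambda_0) \le 10^{-C}.
\end{aligned}
\tag{2.1.20}
\]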

To solve the optimization problem we create a cost function, J, incorporating the constraint

for the central wavelength into the minimization via a Lagrange multiplier, µ0, which yields

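\[
\begin{aligned}
J &= u^T u + \mu_0\left(I_{DH} - 10^{-C}\right) \\
  &= u^T u + \mu_0\left(\frac{4\pi^2}{\lambda_0^2}\, u^T M_0 u + \frac{4\pi}{\lambda_0}\, u^T \Im\{b_0\} + d_0 - 10^{-C}\right) \\
J &= u^T\left(I + \mu_0\frac{4\pi^2}{\lambda_0^2} M_0\right)u + \mu_0\frac{4\pi}{\lambda_0}\, u^T \Im\{b_0\} + \mu_0\left(d_0 - 10^{-C}\right).
\end{aligned}
\tag{2.1.21}
\]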

The cost function is quadratic in form, guaranteeing a single minimum. Recognizing that

$M_0 = M_0^T$, we take the partial derivative to find the actuation that minimizes the cost function.

Evaluating

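\[
\left.\frac{\partial J}{\partial u^T}\right|_{u_{opt}} = 2\left(I + \mu_0\frac{4\pi^2}{\lambda_0^2} M_0\right)u_{opt} + \mu_0\frac{4\pi}{\lambda_0}\,\Im\{b_0\} = 0 \tag{2.1.22}
\]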

and solving for the optimal control input, we find

\[
u_{opt} = -\mu_0\left(\frac{\lambda_0}{2\pi} I + \mu_0\frac{2\pi}{\lambda_0} M_0\right)^{-1} \Im\{b_0\}. \tag{2.1.23}
\]
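
A minimal Python sketch of the control law in Eq. 2.1.23 is given here. It makes several assumptions that are not part of the derivation: the search on µ0 is a simple geometric increase that stops as soon as the linearized intensity of Eq. 2.1.16 meets the target, actuator heights are expressed in units of λ0, and G and the aberrated field are synthetic, constructed so that the target is reachable. The function names are hypothetical.

```python
import numpy as np

def stroke_min_command(M0, b0, mu, wavelength):
    """Eq. 2.1.23 for a real command vector u (only Re{M0} acts on real u)."""
    k = 2.0 * np.pi / wavelength
    lhs = np.eye(M0.shape[0]) / k + mu * k * M0.real
    return -mu * np.linalg.solve(lhs, b0.imag)

def dark_hole_intensity(u, M0, b0, d0, wavelength):
    """Linearized dark hole intensity, Eq. 2.1.16."""
    k = 2.0 * np.pi / wavelength
    return k**2 * u @ M0.real @ u + 2.0 * k * u @ b0.imag + d0

def solve_stroke_min(M0, b0, d0, wavelength, target,
                     mu_start=1e-6, growth=2.0, max_steps=200):
    """Increase mu geometrically until the linearized intensity meets the target."""
    mu = mu_start
    for _ in range(max_steps):
        u = stroke_min_command(M0, b0, mu, wavelength)
        if dark_hole_intensity(u, M0, b0, d0, wavelength) <= target:
            return u, mu
        mu *= growth
    raise RuntimeError("target not reached within the linear model")

rng = np.random.default_rng(3)
n_pix, n_act = 200, 12               # hypothetical sizes
wavelength = 1.0                     # heights in units of lambda0 (assumption)
k = 2.0 * np.pi / wavelength

G = (rng.standard_normal((n_pix, n_act))
     + 1j * rng.standard_normal((n_pix, n_act))) / np.sqrt(n_pix)
u_true = 1e-2 * rng.standard_normal(n_act)
noise = 1e-5 * (rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix))
E_ab = -1j * k * (G @ u_true) + noise    # aberration the DM can almost cancel

M0 = G.conj().T @ G
b0 = G.conj().T @ E_ab
d0 = np.vdot(E_ab, E_ab).real
target = 1e-3 * d0                   # three orders of magnitude of suppression

u_opt, mu_opt = solve_stroke_min(M0, b0, d0, wavelength, target)
I_final = dark_hole_intensity(u_opt, M0, b0, d0, wavelength)
```

Solving the linear system with np.linalg.solve avoids forming the explicit inverse in Eq. 2.1.23. Small values of µ0 favor low stroke and larger values favor suppression, so the first µ0 that meets the target gives a low-stroke command that satisfies the constraint in this sketch.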

To find the value of uopt, all that is left is to find the value of µ0 that minimizes the cost

function, Eq. 2.1.21. This is typically done via a line search on µ0, evaluating u with

Eq. 2.1.23 for each value of µ0 until Eq. 2.1.21 reaches a minimum. Pueyo et al. [55] have

rigorously shown that this optimization is in fact a quadratic subprogram to the full nonlinear

problem. They have also shown that for a single iteration, the controller is guaranteed to

achieve the targeted contrast level, 10−C , provided the electric field is known perfectly and

the DM control magnitude is small enough that it remains within the bounds of the current

linearization. Since the sub-program is convex, we can reach its global minimum if we re-

linearize about the new DM shape at each iteration of the correction algorithm (or every

time we apply a new DM command). This is computationally expensive, so we tend not to

do this in the experiment. However the nature of the controller is such that the solution

will not deviate dramatically from the optimization we get by re-linearizing each time. Since

the controller is trying to minimize stroke, the control tends to remain in the regime of a

particular linearization for as long as possible. Additionally, the magnitude of actuation is

proportional to the contrast it is trying to suppress. Since each control step is operating

on lower contrast levels the actuation magnitude will also tend to decrease, making the

deviation from the last control shape smaller with each iteration. As an example, Fig. 2.1

shows the evolution of the mode, median, and absolute peak to valley deformation for DM1

and DM2 as a function of the control history. The vast majority of the stroke is used to

eliminate strong aberrations in the first 5 iterations. After this, the extrema and mode

reduce dramatically, frequently a factor of 10 less stroke than the first iteration. Throughout

the entire control history the median never deviates far from zero, meaning that a DC drift

never develops. Perhaps the best balance is to re-linearize when the contrast performance is

poor (requiring large stroke), and discontinue re-linearizing as the contrast improves (since

the second order term neglected in Eq. 2.1.3 will become less significant). Neither of these

techniques guarantees that we reach the global optimum for the subprogram, but for a space

telescope it is probably worth the computational savings since it will not deviate significantly.

Figure 2.1: Time history of the mode, median, and peak-to-valley actuation levels for DM1 and DM2. The starting contrast is $1 \times 10^{-4}$ and the final contrast is $2.3 \times 10^{-7}$ at the 30th iteration of the control algorithm. Panels: (a) DM1 Actuation Per Iteration, (b) DM2 Actuation Per Iteration. Each panel plots height change (nm) against iteration for the mode, median, and max − min.

Pueyo et al. [55] also show how to account for multiple DMs at non-conjugate planes. As

shown in §1.6.2, using two DMs in planes non-conjugate to the pupil we can take advantage

of their relative propagation to produce phase induced amplitude distributions at the pupil.

This makes both the real and imaginary parts of the image plane controllable, allowing

the creation of symmetric dark holes about the PSF. Thus, we can generate a dark hole

within the entire search area made available by the coronagraph. By virtue of the linearity

and small value approximations made to reach Eq. 2.1.4 and Eq. 2.1.5, the effect is purely

additive since the cross-terms, or cross-talk, between the DMs will be negligible. Including

the propagation factors from Eq. 1.6.27 into the transform for DM1, C1{·}, and DM2, C2{·},

the image plane electric field with two DMs in series is

Eim(x, y) = F{A(u, v)(1 + g(u, v))}+ iC1{A(u, v)φ1}+ iC2{A(u, v)φ2}. (2.1.24)

Applying the inner product, we find the intensity at the image plane to be

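\[
\begin{aligned}
I_{DH} ={}& \langle \mathcal{C}_1\{A\phi_1\}, \mathcal{C}_1\{A\phi_1\} \rangle + \langle \mathcal{C}_2\{A\phi_2\}, \mathcal{C}_2\{A\phi_2\} \rangle \\
&+ \langle \mathcal{C}_1\{A\phi_1\}, \mathcal{C}_2\{A\phi_2\} \rangle + \langle \mathcal{C}_2\{A\phi_2\}, \mathcal{C}_1\{A\phi_1\} \rangle \\
&+ 2\Re\{\langle \mathcal{C}\{A(1+g)\}, i\,\mathcal{C}_1\{A\phi_1\} \rangle\} + 2\Re\{\langle \mathcal{C}\{A(1+g)\}, i\,\mathcal{C}_2\{A\phi_2\} \rangle\} \\
&+ \langle \mathcal{C}\{A(1+g)\}, \mathcal{C}\{A(1+g)\} \rangle.
\end{aligned}
\tag{2.1.25}
\]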

Since IDH is a scalar value, we can maintain the same form for Eq. 2.1.16 and Eq. 2.1.21

by simply augmenting the control matrix, u = [uDM1 uDM2 . . . uDMi]T . Using the same

superposition principle and influence function set for the second DM, we define their control

effect matrices, GDM1 and GDM2, as

GDM1 = C1{Af1} (2.1.26)

GDM2 = C2{Af2}. (2.1.27)

With two DMs, the matrices in the control law given by Eq. 2.1.21 and Eq. 2.1.23 now take

the form

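\[
M_0 = \begin{bmatrix} G_{DM1}^* G_{DM1} & G_{DM1}^* G_{DM2} \\ G_{DM2}^* G_{DM1} & G_{DM2}^* G_{DM2} \end{bmatrix} \tag{2.1.28}
\]

\[
b_0 = \begin{bmatrix} G_{DM1}^*\, \mathcal{C}\{A(1+g)\} \\ G_{DM2}^*\, \mathcal{C}\{A(1+g)\} \end{bmatrix} \tag{2.1.29}
\]

\[
d_0 = \mathcal{C}\{A(1+g)\}^*\, \mathcal{C}\{A(1+g)\}. \tag{2.1.30}
\]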
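
The block structure of Eq. 2.1.28 and Eq. 2.1.29 follows from concatenating the two control effect matrices column-wise, which the Python sketch below illustrates. The matrices are random stand-ins for C1{Af1} and C2{Af2}, and the sizes and the function name are hypothetical.

```python
import numpy as np

def two_dm_matrices(G1, G2, E_ab):
    """Build M0, b0, d0 for u = [u_DM1, u_DM2]^T from the per-DM matrices."""
    G = np.hstack([G1, G2])            # augmented control effect matrix
    M0 = G.conj().T @ G                # block form of Eq. 2.1.28
    b0 = G.conj().T @ E_ab             # stacked form of Eq. 2.1.29
    d0 = np.vdot(E_ab, E_ab).real      # Eq. 2.1.30
    return M0, b0, d0

rng = np.random.default_rng(4)
n_pix, n1, n2 = 50, 6, 6               # hypothetical sizes
G1 = rng.standard_normal((n_pix, n1)) + 1j * rng.standard_normal((n_pix, n1))
G2 = rng.standard_normal((n_pix, n2)) + 1j * rng.standard_normal((n_pix, n2))
E_ab = rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix)

M0, b0, d0 = two_dm_matrices(G1, G2, E_ab)
```

The off-diagonal blocks of M0 are the cross terms between the two DMs that appear in Eq. 2.1.25, so the same solver can be reused with the augmented command vector.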

Thus, the computation remains the same regardless of the number of DMs. The only change

is in the dimension of u, M0, and b0, which encode the different propagation factors for each

DM. Using this control law is not equivalent to other multi-DM concepts such as multi-

conjugate AO (MCAO) [52], woofer-tweeter concepts [48], or wide field AO [61]. As was

shown in §1.6, it is the different propagation distances that give two DMs in series their

power in the control algorithm. It is also worth mentioning that in the event of a DM failure

this provides redundancy in the mission, mitigating the risk involved in flying an unproven

technology in space.

Having developed a monochromatic wavefront control algorithm we now have confidence

in its controllability with regard to both amplitude and phase aberrations in the image plane.

Once we are provided with an electric field measurement we may solve the optimal control

problem to suppress the field below a specified level on both sides of the image plane.

2.2 Wavelength Dependence of the Image Plane

The Stroke Minimization algorithm in §2.1 only operates on a single wavelength, λ0, which

means that suppression of the field is neither optimal nor guaranteed over a bandwidth.

Pueyo [59] showed through simulation that the bandwidth must be less than ≈ 1 − 2% of

the central wavelength. Practically, we can also define a “single” wavelength by our image

plane resolution, requiring that the difference in plate scale between the maximum and min-

imum wavelengths is less than a pixel (≈ 0.1 pixel width). However, the primary purpose

of directly imaging an exoplanet is to measure its spectra. Using the monochromatic con-

trol algorithm, we would have to correct for the aberrations at each wavelength separately

to obtain a full spectrum of the planet, which would take prohibitively long. Instead, we

would like to make the control algorithm effective over a bandwidth. This has the potential

to improve the efficiency of spectral characterization and enables detection in a broadband

image, critical in a photon limited system. Moving to broadband algorithms requires that

we have a good understanding of the nature of coherence from starlight. We already un-

derstand that the star is spatially coherent because of the large propagation distances; the

radius of curvature of the wavefronts from the star is so large and the “point sources” from different

locations on the star are so close that we effectively image parallel, plane wavefronts. We

also understand that the light from a star has extremely short coherence length, requiring

equal path interferometers to interfere the light with itself. What we must address is how

to integrate in wavelength to compute the broadband image. Knowing that the emission of

a star is effectively from random radiators (in that they are not in phase with one another)

we can assume that each wavelength will not interfere with the other. As a result we may

integrate the intensity over wavelength to compute what our broadband image should be,

making it very simple to augment the optimization problem to accommodate the additional

wavelengths. Eq. 2.1.16 shows which terms depend on the specified wavelength in

monochromatic light, λ0. For arbitrary wavelength, λ, the dark hole intensity is

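\[
I_{DH}(\lambda) = w(\lambda)\frac{4\pi^2}{\lambda^2}\, u^T M_\lambda u + w(\lambda)\frac{4\pi}{\lambda}\, u^T \Im\{b_\lambda\} + w(\lambda)\, d_\lambda. \tag{2.2.1}
\]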

We have included a normalization function, w(λ), in Eq. 2.2.1 to account for the fact that

the relative intensity of each wavelength will vary. To simplify the normalization, w(λ) is

defined so that w(λ0) = 1, where λ0 will be centered in the control bandwidth, ∆λ. It is

also important to note that in addition to the chromaticity of the coronagraph (if any) and

aberrations, Mλ, bλ, and dλ vary in wavelength because of the transform, C{·}, making

Mλ = < Gλ, Gλ > (2.2.2)

bλ = < Cλ{A(1 + gλ)}, Gλ > (2.2.3)

dλ = < Cλ{Agλ}, Cλ{Agλ} >, (2.2.4)

where Mλ is simply the transformation from u to image plane intensity. Every control effect

matrix, Mλ, may be precomputed for each wavelength, assuming it remains within the linear

regime of the DM shape. However, bλ requires a measurement of the aberrated field to be

computed. Since dλ is the intensity distribution of the aberrated field, this simply requires

an exposure of the aberrated field at that wavelength. With bλ and dλ requiring an estimate

of the current state, we will require more exposures if the field is to be evaluated at multiple

wavelengths.

2.3 Continuous Bandwidth Constraint

Knowing the wavelength dependence of the electric field in the image plane, we now seek

an optimization that suppresses the field to a specified contrast level over a bandwidth, ∆λ,

centered about our central wavelength, λ0. Thus, our statement of the problem becomes

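\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to} \quad & \frac{1}{\Delta\lambda} \int_{\lambda_0 - \Delta\lambda/2}^{\lambda_0 + \Delta\lambda/2} I_{DH}(\lambda)\, d\lambda \le 10^{-C}.
\end{aligned}
\tag{2.3.1}
\]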

There are two problems with this formulation. First, we have produced a numerically in-

tractable solution, where the numerical minimization requires that a continuous integral be

evaluated many times. This also requires an electric field measurement (still assumed to

be perfectly known) over the full integral. Since the functional dependence in wavelength is

unknown (despite our assumption to this point that the field is provided to us) this drives the

number of required field measurements to infinity. Thus, the optimization problem should

be solved for a discrete set of Nλ wavelengths, changing the optimization problem to

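\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to} \quad & \frac{1}{N_\lambda} \sum_{i=1}^{N_\lambda} I_{DH}(\lambda_i) \le 10^{-C}.
\end{aligned}
\tag{2.3.2}
\]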
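
The left-hand side of the discretized constraint in Eq. 2.3.2 can be sketched in Python by evaluating Eq. 2.2.1 at each wavelength and averaging. The per-wavelength matrices below are random stand-ins, heights are expressed in units of λ0, the weights w(λ) are set to one, and the function names are hypothetical.

```python
import numpy as np

def dark_hole_intensity(u, M, b, d, wavelength, weight=1.0):
    """Eq. 2.2.1 at one wavelength, with M, b, d evaluated at that wavelength."""
    k = 2.0 * np.pi / wavelength
    return weight * (k**2 * u @ M.real @ u + 2.0 * k * u @ b.imag + d)

def band_averaged_intensity(u, models):
    """Left-hand side of the constraint in Eq. 2.3.2.
    `models` is a list of (wavelength, weight, M, b, d) tuples."""
    return np.mean([dark_hole_intensity(u, M, b, d, lam, w)
                    for lam, w, M, b, d in models])

rng = np.random.default_rng(6)
n_pix, n_act = 100, 8                  # hypothetical sizes
lam0 = 1.0                             # heights in units of lambda0 (assumption)
models, fields = [], []
for gamma in (0.95, 1.0, 1.05):        # hypothetical 10% window
    lam = gamma * lam0
    G = (rng.standard_normal((n_pix, n_act))
         + 1j * rng.standard_normal((n_pix, n_act))) / np.sqrt(n_pix)
    E_ab = 1e-2 * (rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix))
    models.append((lam, 1.0, G.conj().T @ G, G.conj().T @ E_ab,
                   np.vdot(E_ab, E_ab).real))
    fields.append((lam, G, E_ab))      # kept only to cross-check the result

u = 1e-3 * rng.standard_normal(n_act)
I_band = band_averaged_intensity(u, models)
```

Because only the average is constrained, a command can satisfy this inequality while one wavelength remains well above the target, which is the limitation discussed in the remainder of this section.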

In this formulation there is an implicit assumption that if we constrain a discrete set of

wavelengths to fall under a targeted contrast value this will correspond to suppression of all

wavelengths between them. For example, if we are provided two monochromatic estimates

that bound a bandwidth we can find the cost function for Eq. 2.3.2 in the same manner as

§2.1 to find a set of DM commands that suppress those two wavelengths. However, there is no

guarantee that when we image the entire bandwidth we will maintain the targeted contrast

level. We must discretize the wavelengths in the optimization but we lose our ability to

guarantee suppression over the full bandwidth. In the end, successfully suppressing the

field depends upon the size of the band and the number of discrete wavelengths chosen. To

help guarantee suppression between the optimized wavelengths, we appeal to the concept of

maintaining small phase shifts so that there are no dramatic changes in the wavefront. For

any two wavelengths in the optimization, (λ1, λ2), the bandwidth between them should be

small enough that λ2 < 2λ1. Since we target bandwidths of ∆λ/λ0 = 10 − 20% there isn’t

a dramatic shift in the relative phase for these wavelengths, and our expectation is that the

contrast should be maintained across the entire band.

Looking further, the optimization does not guarantee a particular contrast level for each

wavelength, only that their average be below $10^{-C}$. In other words, formulating the problem in

this way does not give us the freedom to weight the contribution of each wavelength to the

optimization. For characterization we need the speckles to be suppressed equally well over

all wavelengths, but suppression at one wavelength may be more important than at others.

For example, if we bound a spectral feature that is much dimmer we would require higher

contrast at this wavelength. While Eq. 2.3.2 is undoubtedly the optimal solution with regard

to suppressing a bandwidth below a particular value, it will not necessarily guarantee our

ability to obtain a spectral measurement. Therefore we will continue to further constrain

the problem to get the desired properties out of the controller.

2.4 Windowed Stroke Minimization

As discussed in section 2.3 we seek to make the problem of correcting over a bandwidth

computationally tractable by discretizing the integral in Eq. 2.3.1 to a summation of finite

wavelengths. We cannot avoid the problem of guaranteeing suppression between wavelengths,

but we can try to guarantee suppression of each wavelength in Eq. 2.3.2 by using multiple

constraints in the optimization. Rather than simply summing the intensities we will impose

a separate constraint for each wavelength. In this formulation, we will choose three wave-

lengths, one at the center of the bandwidth, λ0, and two more providing the boundaries for

the problem, λ1, λ2, to define a window over which the correction will be made. Applying

three separate constraints, the optimization becomes

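\[
\begin{aligned}
\text{minimize} \quad & \sum_{k=1}^{N_{act}} a_k^2 = u^T u \\
\text{subject to:} \quad & I_{DH}(\lambda_0) \le 10^{-C_{\lambda_0}}, \\
& I_{DH}(\lambda_1) \le 10^{-C_{\lambda_1}}, \\
& I_{DH}(\lambda_2) \le 10^{-C_{\lambda_2}} \\
\text{where} \quad & \lambda_1 = \gamma_1 \lambda_0 \\
& \lambda_2 = \gamma_2 \lambda_0.
\end{aligned}
\tag{2.4.1}
\]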

Now we will find the optimal control law by augmenting the minimization with three Lagrange

multipliers, one for each discrete value of the intensity. Following the same procedure as

in §2.1, we write the cost function as

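\[
\begin{aligned}
J ={}& u^T u + \mu_0\left(I_{DH}(\lambda_0) - 10^{-C_{\lambda_0}}\right) + \mu_1\left(I_{DH}(\lambda_1) - 10^{-C_{\lambda_1}}\right) + \mu_2\left(I_{DH}(\lambda_2) - 10^{-C_{\lambda_2}}\right) \\
J ={}& u\left[I + \frac{4\pi^2}{\lambda_0^2}\left(\mu_0 M_{\lambda_0} + \mu_1\frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \mu_2\frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2}\right)\right]u^T \\
&+ \frac{4\pi}{\lambda_0}\left[\mu_0\Im\{b_{\lambda_0}\} + \mu_1\frac{w(\lambda_1)}{\gamma_1}\Im\{b_{\lambda_1}\} + \mu_2\frac{w(\lambda_2)}{\gamma_2}\Im\{b_{\lambda_2}\}\right]u^T \\
&+ \left[\mu_0\left(d_{\lambda_0} - 10^{-C_{\lambda_0}}\right) + \mu_1 w(\lambda_1)\left(d_{\lambda_1} - 10^{-C_{\lambda_1}}\right) + \mu_2 w(\lambda_2)\left(d_{\lambda_2} - 10^{-C_{\lambda_2}}\right)\right].
\end{aligned}
\tag{2.4.2}
\]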

Taking the partial derivative of the resulting cost function yields the optimal DM command

for a subset of wavelengths spanning the entire bandwidth ∆λ. As before, the optimal

command is determined by performing a line search on µ. We now have an optimization

across three variables, complicating the task of minimizing the function. We still have the

same problem as in §2.3 that the globally optimal solution in the three dimensional space

(µ0, µ1, µ2) does not necessarily guarantee the targeted suppression at all three wavelengths.

However, with three Lagrange multipliers we can guarantee suppression of all three wave-

lengths by restricting the search to a single dimension. We write the Lagrange multipliers of

the two bounding wavelengths as weighted values of the first so that µ1 = δ1µ0 and µ2 = δ2µ0.

Applying this relationship, Eq. 2.4.2 becomes

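\[
\begin{aligned}
J ={}& u\left[I + \mu_0\frac{4\pi^2}{\lambda_0^2}\left(M_{\lambda_0} + \delta_1\frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \delta_2\frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2}\right)\right]u^T \\
&+ \mu_0\frac{4\pi}{\lambda_0}\left[\Im\{b_{\lambda_0}\} + \delta_1\frac{w(\lambda_1)}{\gamma_1}\Im\{b_{\lambda_1}\} + \delta_2\frac{w(\lambda_2)}{\gamma_2}\Im\{b_{\lambda_2}\}\right]u^T \\
&+ \mu_0\left[\left(d_{\lambda_0} - 10^{-C_{\lambda_0}}\right) + \delta_1 w(\lambda_1)\left(d_{\lambda_1} - 10^{-C_{\lambda_1}}\right) + \delta_2 w(\lambda_2)\left(d_{\lambda_2} - 10^{-C_{\lambda_2}}\right)\right].
\end{aligned}
\tag{2.4.3}
\]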

Taking the partial derivative and evaluating at zero,

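\[
\begin{aligned}
\left.\frac{\partial J}{\partial u^T}\right|_{u_{opt}} ={}& 0 \\
={}& u_{opt}\left[2I + 2\mu_0\frac{4\pi^2}{\lambda_0^2}\left(M_{\lambda_0} + \delta_1\frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \delta_2\frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2}\right)\right] \\
&+ \mu_0\frac{4\pi}{\lambda_0}\left[\Im\{b_{\lambda_0}\} + \delta_1\frac{w(\lambda_1)}{\gamma_1}\Im\{b_{\lambda_1}\} + \delta_2\frac{w(\lambda_2)}{\gamma_2}\Im\{b_{\lambda_2}\}\right],
\end{aligned}
\tag{2.4.4}
\]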

we can use the value of µ0 that minimizes Eq. 2.4.3 to compute the following optimal com-

mand:

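\[
\begin{aligned}
u_{opt} ={}& -\mu_0\left[\Im\{b_{\lambda_0}\} + \delta_1\frac{w(\lambda_1)}{\gamma_1}\Im\{b_{\lambda_1}\} + \delta_2\frac{w(\lambda_2)}{\gamma_2}\Im\{b_{\lambda_2}\}\right] \\
&\cdot \left[\frac{\lambda_0}{2\pi} I + \mu_0\frac{2\pi}{\lambda_0}\left(M_{\lambda_0} + \delta_1\frac{w(\lambda_1)}{\gamma_1^2} M_{\lambda_1} + \delta_2\frac{w(\lambda_2)}{\gamma_2^2} M_{\lambda_2}\right)\right]^{-1}.
\end{aligned}
\tag{2.4.5}
\]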
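
A minimal Python sketch of the windowed command in Eq. 2.4.5 follows. It is written for a column vector u (the transpose of the row form above) and keeps only the real part of each Mλ, since u is real. The per-wavelength matrices are random stand-ins, heights are expressed in units of λ0, and the values of µ0, γ, δ, and w(λ) as well as the function names are hypothetical. The cost function in the sketch omits constant terms that do not affect the minimizer.

```python
import numpy as np

def windowed_command(mu0, lam0, Ms, bs, gammas, deltas, weights):
    """Eq. 2.4.5 for a column vector u. Index 0 is the central wavelength,
    for which gamma = delta = w = 1."""
    k0 = 2.0 * np.pi / lam0
    n_act = Ms[0].shape[0]
    M_sum = np.zeros((n_act, n_act))
    b_sum = np.zeros(n_act)
    for M, b, gam, dlt, w in zip(Ms, bs, gammas, deltas, weights):
        M_sum += dlt * w / gam**2 * M.real
        b_sum += dlt * w / gam * b.imag
    lhs = np.eye(n_act) / k0 + mu0 * k0 * M_sum
    return -mu0 * np.linalg.solve(lhs, b_sum)

def windowed_cost(u, mu0, lam0, Ms, bs, ds, gammas, deltas, weights, targets):
    """J = u^T u + sum_i delta_i mu0 (I_DH(lambda_i) - target_i), lambda_i = gamma_i lam0."""
    J = u @ u
    for M, b, d, gam, dlt, w, tgt in zip(Ms, bs, ds, gammas, deltas, weights, targets):
        k = 2.0 * np.pi / (gam * lam0)
        intensity = w * (k**2 * u @ M.real @ u + 2.0 * k * u @ b.imag + d)
        J += dlt * mu0 * (intensity - tgt)
    return J

rng = np.random.default_rng(7)
n_pix, n_act = 100, 8                    # hypothetical sizes
lam0, mu0 = 1.0, 0.1                     # heights in units of lambda0 (assumption)
gammas = (1.0, 0.95, 1.05)               # lambda_i / lambda0
deltas = (1.0, 0.8, 0.8)                 # bounding wavelengths underweighted
weights = (1.0, 0.9, 1.1)                # w(lambda_i), with w(lambda0) = 1
targets = (1e-6, 1e-6, 1e-6)

Ms, bs, ds = [], [], []
for _ in gammas:
    G = (rng.standard_normal((n_pix, n_act))
         + 1j * rng.standard_normal((n_pix, n_act))) / np.sqrt(n_pix)
    E_ab = 1e-2 * (rng.standard_normal(n_pix) + 1j * rng.standard_normal(n_pix))
    Ms.append(G.conj().T @ G)
    bs.append(G.conj().T @ E_ab)
    ds.append(np.vdot(E_ab, E_ab).real)

u_opt = windowed_command(mu0, lam0, Ms, bs, gammas, deltas, weights)
J_opt = windowed_cost(u_opt, mu0, lam0, Ms, bs, ds, gammas, deltas, weights, targets)
```

Setting both δ values to zero recovers the monochromatic command of Eq. 2.1.23, which gives a convenient check when implementing the weighted form.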

By parameterizing the three Lagrange multipliers we can weigh their effect on the cost

function, thus allowing us to control the degree to which each constraint is satisfied. If

we choose δ1 = δ2 = 1 we have made each contrast target equally important, making the

problem equivalent to solving Eq. 2.3.2. We can also control the degree to which achieving

the bandwidth affects the optimization. If we need a very soft correction outside of the

central wavelength then we can choose δ1 and δ2 to be less than one. We may also find

that we need to preferentially weight one side of the bandwidth to accommodate variance in

absorption and emission from the planet.

Eq. 2.4.3 and Eq. 2.4.5 have simplified the problem of optimal broadband correction by

writing a cost function with a single Lagrange multiplier, while leaving the degree of freedom

to weight the required performance of each wavelength. This parameterization constrains

the path of the original 3D optimization to lie along a vector. The direction of the vector

is arbitrary, set by the user based on the values chosen for (δ1, δ2). To evaluate this cost

function we must account for the wavelength dependence of each matrix, which triples our

computational cost. The matrices describing the impact of the DMs on the electric field,

Mλ, can be precomputed since this is simply a linear map from unitary DM actuation to

image plane intensity. The only difference between the three is the wavelength used in the

transform, Cλ{·}. They only need to be re-evaluated when the system is re-linearized. Since

bλ and dλ require a measurement of the current electric field, this cannot be pre-computed

and must be measured for each wavelength.

Practically, a windowed optimization allows us to seek commands to suppress over a

given filter set in the instrument. In the most conservative case the bounding wavelengths

would be the edges of the filter. In a more aggressive control mode the window could be

chosen to span the smallest and largest wavelengths across a set of filters. The disadvantage

of this approach is that it does not solve the problem of guaranteeing correction between

the intermediate wavelengths. This does, however, provide the freedom to arbitrarily weight

the contrast performance separately for each wavelength. This degree of freedom is actually

rather useful, since the relative intensity of the planet to its parent star is a function of

wavelength. For example, in the visible spectrum Earth is 10−10 times dimmer than the sun

but in the infrared it is only 10−6 times dimmer, and Des Marais et al. [17] show that we

can expect order of magnitude fluctuations in the reflectance of a terrestrial body with an

atmosphere within a ≈20% band. Thus we can use our weighting values (δ1, δ2) to relax the

DM commands with respect to wavelengths that we do not expect to require such stringent

contrast levels, purely based on the blackbody spectrum of the parent star. With regard

to the control problem, we will find in the following sections that there is a significant

amount of error introduced in the estimates at bounding wavelengths, regardless of whether

they are provided by a direct estimate or an extrapolation method (§ 2.5). The ability to

underweight the bounding wavelengths in the cost function gives us the ability to soften

the effect of those errors. In fact, we often found the best performance from the results in

Ch. 5 when the bounding wavelengths were slightly underweighted. Noting this behavior,

an interesting adaptive control scheme would be to modulate δ1 and δ2 based on an estimate

of the error introduced by the extrapolated fields at the bounding wavelengths. This could

go so far as using these values to adjust the functional form of the extrapolation that we

will derive in §2.5.

2.5 Extrapolating Estimates in Wavelength

The Windowed Stroke Minimization algorithm of §2.4 solves the bandwidth problem, but

its implementation can be quite complicated because its wavelength dependence requires

multiple transforms and field estimates. The wavelength dependent matrices, bλ and dλ,

represent the component of the electric field from coupling between aberrations and the

DM-induced perturbations and the intensity distribution of the aberrated field respectively.

Computing both requires an estimate of the electric field at each iteration of the quadratic

subprogram. In FPWC the estimate is what drives the correction time, not the controller.

If we could measure the field directly, one iteration of the correction algorithm would only

require one image per wavelength to measure the field plus one more to see the control effect.

However, as will be discussed in Ch. 3, the electric field cannot be measured directly but

must be estimated. This involves taking many exposures to estimate the field (the number

of which is a major topic in Ch. 3), between two and eight exposures per wavelength. Once

provided with an electric field estimate, we still only require one image to measure the

control effect. As such, we only save one exposure per wavelength (two for Windowed Stroke

Minimization) if we have to directly estimate each field. Thus we gain very little in controller

performance by going to such a complicated algorithm. In terms of correction efficiency we

would almost be better off correcting each wavelength individually using the monochromatic

control law in §2.1.
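
The exposure count behind this comparison can be tallied with a short Python sketch. The accounting is our reading of the paragraph above and should be treated as an assumption: each directly estimated wavelength costs n_est exposures, a separate monochromatic correction needs one control-effect image per wavelength, and the windowed controller needs a single control-effect image for the whole window. The function names are hypothetical.

```python
def exposures_monochromatic(n_wavelengths, n_est):
    """Each wavelength corrected separately: n_est estimation images plus
    one control-effect image per wavelength (assumed accounting)."""
    return n_wavelengths * (n_est + 1)

def exposures_windowed_direct(n_wavelengths, n_est):
    """Windowed correction with a direct estimate at every wavelength:
    n_est estimation images per wavelength plus a single control-effect
    image (assumed accounting)."""
    return n_wavelengths * n_est + 1

for n_est in (2, 4, 8):
    mono = exposures_monochromatic(3, n_est)
    windowed = exposures_windowed_direct(3, n_est)
    print(f"n_est={n_est}: separate={mono}, windowed={windowed}, saved={mono - windowed}")
```

Under this accounting the saving for three wavelengths is two exposures per iteration regardless of n_est, so the estimation exposures dominate the cost in both cases.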

If we can eliminate our need to estimate every wavelength, we can reduce the number

of exposures per iteration by as much as ≈ 60%. We still require a field estimate at a

single wavelength, but from that we will attempt to extrapolate what the field is at other

wavelengths. To do this, we will first make some assumptions about the electric field to

write a functional relationship describing how the aberrated field evolves in wavelength as

it deviates from the estimate at the central wavelength, λ0. To include the wavelength

dependence of the transform, Cλ{·}, we will characterize the variance of the aberrations at

the pupil plane. An arbitrary aberrated field at the pupil may be described as

\[
E_{pup,abr} = \alpha(u,v,\lambda)\, e^{i\beta(u,v,\lambda)}. \tag{2.5.1}
\]

For the sake of computational efficiency, we will describe the functional form of α(u, v, λ)

and β(u, v, λ) by assuming that the errors are induced by optics and that these errors are

effectively located at the pupil plane. Uniform amplitude variation in wavelength (such

as the change in reflectivity of a mirror as a function of wavelength) will be absorbed by

the normalization of the field, since it is analogous to intensity fluctuation in wavelength.

Thus α(u, v, λ) describes the spatial variation of amplitude across the pupil as a function

of wavelength. If we assume the amplitude variations due to system optics are analogous

to reflectivity variations, i.e., coating errors across the surface of the mirror, the amplitude

aberrations become independent of wavelength, making Eq. 2.5.1

\[
E_{pup,abr} = \alpha(u,v)\, e^{i\beta(u,v,\lambda)}. \tag{2.5.2}
\]

This assumption does not necessarily hold in the presence of reflectivity variations that exist

at non-conjugate planes because of phase induced amplitude errors that arrive at the pupil.

We acknowledge this as a limitation, but we will keep the assumption to maintain a simple

functional relationship since this computation must be made during the control loop. Our

task will be to see how effective this assumption is in the experiment.

Moving to β(u, v, λ), we assume that the phase errors are from shaping errors within

the system optics. We assume that these errors exist at the pupil plane. Assuming a

particular height perturbation to the shape of the optic, h(u, v), we can write the phase errors

β as

\[
\beta(u,v,\lambda) = \frac{2\pi}{\lambda}\, h(u,v). \tag{2.5.3}
\]

This means that the phase errors in the pupil plane are inversely proportional to wavelength.

This makes intuitive sense since a fixed perturbation induced by an optic applies a smaller

phase disturbance as the wavelength increases. Defining the incident phase by our estimated

wavelength, λ0, as

\[
\beta_0(u,v) = \frac{2\pi}{\lambda_0}\, h(u,v), \tag{2.5.4}
\]

we can rewrite Eq. 2.5.3 as a function of β0(u, v), λ, and λ0. The phase perturbation at the

pupil then becomes

\[
\beta(u,v,\lambda) = f(\beta_0(u,v), \lambda) = \frac{\lambda_0}{\lambda}\, \beta_0(u,v). \tag{2.5.5}
\]

Applying Eq. 2.5.5 and Eq. 2.5.2 to Eq. 2.5.1, we find that the wavelength dependent

aberrations at the pupil can be approximated as

\[
E_{pup,abr}(u,v,\lambda) = \alpha(u,v)\, e^{i\frac{\lambda_0}{\lambda}\beta_0(u,v)}. \tag{2.5.6}
\]

Assuming an estimate of the electric field at the image plane, Eest(x, y, λ0), that is only

from pupil plane aberrations, our estimate of the pupil plane aberrations is given by

\[
g_0(u,v) = \mathcal{F}^{-1}_{\lambda_0}\{E_{est}(x,y,\lambda_0)\}. \tag{2.5.7}
\]

We now equate g0(u, v) to Eq. 2.5.6, making the phase of the pupil estimate

\[
\begin{aligned}
e^{i\beta_0} &= \frac{g_0}{\alpha} \\
i\beta_0 &= \ln\left(\frac{g_0}{\alpha}\right).
\end{aligned}
\tag{2.5.8}
\]

Applying Eq. 2.5.5, we shift the phase found in Eq. 2.5.8 to get

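\begin{align}
i\beta_\lambda &= i\frac{\lambda_0}{\lambda}\beta_0 = \frac{\lambda_0}{\lambda}\ln\left(\frac{g_0}{\alpha}\right) \notag \\
\alpha(u, v)\, e^{i\beta_\lambda} &= \alpha\, e^{\frac{\lambda_0}{\lambda}\ln\left(\frac{g_0}{\alpha}\right)} \notag \\
&= \alpha\, e^{\frac{\lambda_0}{\lambda}\left(\ln(g_0) - \ln(\alpha)\right)} \notag \\
&= \alpha^{1 - \frac{\lambda_0}{\lambda}}\, g_0^{\frac{\lambda_0}{\lambda}} \notag \\
\alpha(u, v)\, e^{i\beta_\lambda} &= \frac{g_0^{\lambda_0/\lambda}}{|g_0|^{\lambda_0/\lambda - 1}}. \tag{2.5.9}
\end{align}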
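
The pupil-plane scaling of Eq. 2.5.9 can be sketched numerically as follows. This is a minimal illustration, assuming a small-phase aberration so that the principal value of the phase is the physical phase; the function name `scale_pupil_field`, the grid size, and the wavelengths are hypothetical and not part of the original work.

```python
import numpy as np

def scale_pupil_field(g0, lam0, lam):
    """Shift a pupil-plane field estimate g0 from wavelength lam0 to lam (Eq. 2.5.9)."""
    ratio = lam0 / lam
    alpha = np.abs(g0)      # amplitude, assumed wavelength independent
    beta0 = np.angle(g0)    # principal-value phase at lam0 (assumes small phase)
    return alpha * np.exp(1j * ratio * beta0)

# Hypothetical small aberration on a 4x4 pupil grid
rng = np.random.default_rng(0)
alpha = 1.0 + 0.01 * rng.standard_normal((4, 4))
beta0 = 0.05 * rng.standard_normal((4, 4))
g0 = alpha * np.exp(1j * beta0)

g_lam = scale_pupil_field(g0, lam0=600e-9, lam=630e-9)
```

Writing the result as amplitude times a scaled phase is equivalent to the power-law form in the last line of Eq. 2.5.9 when the complex power is taken on its principal branch.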

Reapplying the linear transform for the new wavelength, Fλ{·}, we compute the extrapolated

field from λ0 to λ to be

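$$E_{extrap}(x, y, \lambda) = \mathcal{F}_{\lambda}\left\{ \frac{\mathcal{F}^{-1}_{\lambda_0}\{E_{est}(x, y, \lambda_0)\}^{\lambda_0/\lambda}}{\left|\mathcal{F}^{-1}_{\lambda_0}\{E_{est}(x, y, \lambda_0)\}\right|^{\lambda_0/\lambda - 1}} \right\}. \tag{2.5.10}$$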

Using Eq. 2.5.10 we now have the ability to extrapolate an estimate made at λ0 to bounding

wavelengths. To minimize the bandwidth between estimates, we choose to estimate at the

central wavelength and extrapolate the field for the bounding wavelengths. In applying

the extrapolation, we run into a numerical complication. Since the estimate is finite, the

inverse transform of its shape will convolve with the field we seek, imposing itself on the

extrapolation. Fortunately the area being estimated and controlled is typically smaller than

the image, so we can mitigate the effect by filling in the unknown area with the square

root of the intensity found in the image measuring the control effect. Not knowing the

phase, we are only adding partial information in this region but it serves to soften the

effect of the finite area on the extrapolation. Fig. 2.2 shows an example of an estimate

extrapolation over a ∆λ = 10% bandwidth. Overlaid on the images are boxes defining the

estimation area, inside of which a complex field is provided in Fig. 2.2(b). We then use

Eq. 2.5.10 to compute the field at the bounding wavelengths, Fig. 2.2(a) and Fig. 2.2(c).
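
The fill-in step and the extrapolation of Eq. 2.5.10 can be sketched as below. This is only a structural sketch under loud assumptions: plain FFTs stand in for the transforms at λ0 and λ, so the wavelength-dependent sampling, centering, and any coronagraph optics are omitted, and the function name `extrapolate_estimate` and all array values are hypothetical.

```python
import numpy as np

def extrapolate_estimate(E_est, intensity, mask, lam0, lam):
    """Sketch of Eq. 2.5.10 with plain FFTs standing in for the two transforms."""
    # Outside the estimation area, fill with sqrt(intensity) and zero phase
    E_full = np.where(mask, E_est, np.sqrt(np.clip(intensity, 0.0, None)))
    g0 = np.fft.ifft2(E_full)                               # stand-in inverse transform at lam0
    ratio = lam0 / lam
    g_lam = np.abs(g0) * np.exp(1j * ratio * np.angle(g0))  # Eq. 2.5.9
    return np.fft.fft2(g_lam)                               # stand-in forward transform at lam

rng = np.random.default_rng(1)
n = 16
intensity = 1e-5 * rng.random((n, n))
mask = np.zeros((n, n), dtype=bool)
mask[4:12, 9:14] = True                                     # hypothetical estimation box
E_est = np.sqrt(intensity) * np.exp(1j * rng.uniform(-np.pi, np.pi, (n, n)))

E_same = extrapolate_estimate(E_est, intensity, mask, 633e-9, 633e-9)
E_low = extrapolate_estimate(E_est, intensity, mask, 633e-9, 0.95 * 633e-9)
```

When the target wavelength equals λ0 the exponent is one and the filled-in field is returned unchanged, which is a convenient sanity check on any implementation.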

[Figure 2.2 panels: (a) Lower Wavelength, |Eextrap(λ1)|2; (b) Central Wavelength, |Eest(λ0)|2; (c) Upper Wavelength, |Eextrap(λ2)|2. Both axes of each panel are in units of λ0/D and span −10 to 10; the color scale runs from −6 to −4.]

Figure 2.2: Example of wavelength extrapolation using Eq. 2.5.10 in a bandwidth of ∆λ = 10%. The central wavelength (b) is used to extrapolate the field at a wavelength at the bottom (a) and top (c) of the window. The evolution of the aberrations is more complicated than a physical scaling law.

The extrapolated field estimates for the upper and lower wavelengths are then taken from

the area inside the boxes that defines the correction area. At this stage, it is possible to

mitigate our uncertainty in the phase outside of the estimation area with a Gerchberg-

Saxton like algorithm. We would recursively transform to each wavelength, replacing the

areas outside of the estimate with the newly computed complex field. However, the search

area defined by the image plane mask and the camera resolution may limit the accuracy

of a Gerchberg-Saxton loop. These algorithms have been shown to require upwards of 200

cycles, each involving multiple 2D Fourier transforms, and are very costly with regard to

computation time [40]. Additionally, the level of accuracy gained by such a technique is lost

by the time evolution of the aberrations. This would be a function of computational power

and the time scale of the aberrations. In a space observatory the computational power is

low and on a ground telescope the speckles evolve quickly, making speckle evolution a very

real possibility in either scenario. The idea is certainly worth pursuing in a highly stable

laboratory environment to test if it can improve performance, but is unlikely to be of benefit

as a true observation mode.

By extrapolating (bλ1, dλ1) and (bλ2, dλ2), Eq. 2.5.10 provides all the necessary information for

the Windowed Stroke Minimization algorithm, Eq. 2.4.3 and Eq. 2.4.5. We can now attempt

broadband suppression using only a single monochromatic estimate. However, the simpli-

fication made for both amplitude and phase requires that errors in non-conjugate planes

have a negligible effect on the aberrations in the pupil. Reflectivity variation across any

mirror is generally very low since chemical vapor deposition is a very stable and reliable

process. These errors are typically so low that they only become a limiting factor in extreme

interferometric problems where the null must be very deep, such as the visible nuller coro-

nagraph [46]. In this case the variations in amplitude from non-conjugate surfaces, such as

the DMs (Fig. 1.3), will be negligible and Eq. 2.5.5 will be relatively accurate. However, by

virtue of the fact that we use two DMs to correct amplitude via the propagation of phase

deformations we cannot say the same for the assumption made on amplitude. The phase

induced amplitude aberrations from DM1 and DM2 are significant due to the large nominal

phase errors present on these surfaces. Accounting for these errors will add higher order

wavelength dependence to α(u, v, λ), complicating the form of the transformation we found

in Eq. 2.5.10. Since the spatial frequencies of these aberrations are mostly of very high order

we will continue with our original assumption and hope that this error does not contribute

significantly at the low spatial frequencies we are considering.


The following assumptions were made in the derivations of this chapter.

§2.1 - Monochromatic Wavefront Control:

• Linear approximation made for the DM field.

• The control effect relies on the angular spectrum factor when a DM is non-conjugate

to the pupil.

• g(u, v) and φ(u, v) are both small and the product gφ is negligible.

• For clarity, the control laws are written in a form that assumes φ0 = 0.

• By design, < C{A}, C{A} >, < C{A}, C{Ag} >, and < C{A}, iC{Aφ} > are negligible.

• The DM response is small enough that its response to voltage is linear, and superpo-

sition of influence functions holds.

• Re-linearization of the control matrices is not necessary at each control step because

of the rapid reduction of actuation levels during control.

• The effect of multiple DMs is additive, a consequence of the first three assumptions.

§2.3,§2.4,§2.5 - Broadband Wavefront Control and Extrapolation:

• The bandwidth, ∆λ/λ0, is small enough that wavelengths inside those constraining the

controller will also be suppressed.

• Amplitude variations are wavelength independent and fixed to the pupil plane.

• Phase variations scale as λ0/λ and are fixed to the pupil plane.

Chapter 3

Batch Process Electric Field Estimation

The control algorithms developed in Ch. 2 require that we provide an estimate of the electric

field at the image plane. The accuracy and precision that focal plane wavefront correction

demands of the electric field estimate generally preclude using a

separate wavefront sensor because it introduces non-common path errors. However, the

final science camera is only capable of imaging the magnitude squared of the electric field.

All phase information of the complex field is lost. Therefore we must modulate the field

in some manner to make both amplitude and phase observable at the science detector.

To accomplish this, there are generally two levels of algorithms that have been developed.

Nonlinear estimation schemes based on Gerchberg-Saxton algorithms can accommodate large

phase deformations (many multiples of the wavelength), but their uncertainty is too large

for the controllers of Ch. 2 to reach extremely high levels of contrast [40, 20, 10, 60, 3].

To create dark holes in high contrast images we require the second type, high precision

estimation schemes that only operate in the regime of small phase perturbations. One

approach is to image multiple planes and converge on the estimate using a more precise

version of a Gerchberg-Saxton type algorithm [40, 18, 19, 22]. Another approach is to use

algorithms that modulate the aberrations with the deformable mirror itself. This requires

a model accurate and precise enough to predict the DM’s effect on the image plane at

intensity levels equal to or lower than our desired contrast level, and must be measured

fast enough for the control to be effective. The speed is directly tied to the stability of the

field, which is in turn dependent on the instrument stability. As will be shown in §6.5, the

stability must be quantified as a function of the contrast and we will see that this affects the

performance of our broadband experiments in Ch. 5. For precision estimation, we have used

the DM-Diversity estimation scheme as a baseline to provide the electric field estimate to the

stroke minimization controller [11, 23] because of its widespread use and success in multiple

laboratories [7, 24, 28]. This chapter derives the algorithm and addresses its advantages and

limitations.

3.1 Linearity of the Electric Field

To produce the model for this estimation scheme, we begin as we did in Ch. 2 with a model

relating the electric field at the DM/pupil plane to the electric field at the image plane. Using

an arbitrary linear operator, C{·}, to account for the DM being at a plane non-conjugate to

the pupil we rewrite Eq. 2.1.5 as

Eim(x, y) = C{A(u, v)}+ C{A(u, v)g(u, v)}+ iC{A(u, v)φ(u, v)}. (3.1.1)

In the end we will still use matrix forms to compute the intensity distribution, but rather

than applying a matrix inner product to describe a single scalar value as in Eq. 2.1.9, we seek

the intensity at each pixel in the image. This requires calculating the magnitude squared of

each element in the image, so we will be evaluating the inner product for each scalar value in

the image. Given a particular DM shape, +φ, the intensity distribution at the image plane

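is given by

\begin{align}
I^{+} = {} & \langle \mathcal{C}\{A\}, \mathcal{C}\{A\} \rangle + \langle \mathcal{C}\{Ag\}, \mathcal{C}\{Ag\} \rangle + \langle i\mathcal{C}\{A\phi\}, i\mathcal{C}\{A\phi\} \rangle \notag \\
& + 2\Re\{\langle \mathcal{C}\{A\} + \mathcal{C}\{Ag\}, i\mathcal{C}\{A\phi\} \rangle\} + 2\Re\{\langle \mathcal{C}\{A\}, \mathcal{C}\{Ag\} \rangle\}. \tag{3.1.2}
\end{align}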

We can now describe the interaction of DM actuation and aberrations over the entire control

area where we intend to create a dark hole. The only approximation made in this intensity

distribution is the linearization of the DM shape. Thus, while we have not actually lin-

earized about the aberrated field it must be small enough that the second order term in the

linearization used to produce Eq. 2.1.5 is negligible (in a single control step). Correspond-

ingly, we will find in the following sections that the estimate of the aberrated field is directly

dependent on the linearization of the DM shape.

3.2 Pairwise Images

In Ch. 2 we linearized about the DM shape so that we might create a quadratic cost function

to solve for the optimal control law provided a field. The goal of this chapter is to estimate

the electric field in the image plane given an image described by Eq. 3.1.2. We will do this

by modulating the DM and measuring the effect on the intensity distribution of the image

plane. The additive component of the DM in Eq. 3.1.2 is of no help, but the cross term of the

DM effect with the aberrated field will tell us how the modulation of the DM interacts with

the aberrated field to change the intensity distribution. To make this interaction the sole

observable quantity, we must eliminate all the other terms since they will add bias and noise

to the measurement. As pointed out by Borde and Traub [11], we cannot simply subtract

the image taken prior to applying the probe (making φ = 0) because this does not eliminate

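the additive component of φ. Instead, when we apply the negative

of the DM shape, −φ, the intensity distribution becomes

\begin{align}
I^{-} = {} & \langle \mathcal{C}\{A\}, \mathcal{C}\{A\} \rangle + \langle \mathcal{C}\{Ag\}, \mathcal{C}\{Ag\} \rangle + \langle i\mathcal{C}\{A\phi\}, i\mathcal{C}\{A\phi\} \rangle \notag \\
& - 2\Re\{\langle \mathcal{C}\{A\} + \mathcal{C}\{Ag\}, i\mathcal{C}\{A\phi\} \rangle\} + 2\Re\{\langle \mathcal{C}\{A\}, \mathcal{C}\{Ag\} \rangle\}. \tag{3.2.1}
\end{align}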

If we subtract Eq. 3.2.1 from Eq. 3.1.2 we find the residual to be

$$I^{+} - I^{-} = 4\Re\{\langle \mathcal{C}\{A\} + \mathcal{C}\{Ag\}, i\mathcal{C}\{A\phi\} \rangle\}. \tag{3.2.2}$$

Thus taking difference images leaves us with the product of the DM probe field, C{Aφ}, with

the aberrated and nominal field. Difference imaging has the added benefit of removing any

static incoherent light sources, such as detector bias, stray light, and planet light, leaving

only the coherent component of the field to be measured. The only residual left in the

measurement is the interaction of the DM probe with the nominal field, < C{A}, iC{Aφ} >.

When the aberrations are much larger than the nominal field this will have a negligible

effect. Once the aberrations have been suppressed to a level close to that of the nominal

field this will become significant. However, recent progress showing that wavefront control

is equivalent to computing the profile for a pupil mapping coronagraph [30, 73] has shown

that we should in principle be capable of using our control to go below the nominal field the

coronagraph is designed for [39]. In this case it is not a residual because we want to include

this component of the field in the estimate so we can suppress it.
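
The cancellation in Eq. 3.2.2 can be checked numerically at a handful of pixels. This is a minimal sketch with made-up field values: `E` stands for the combined nominal and aberrated field, `p` for the probe field as it enters Eq. 3.1.2, and `incoherent` for any static incoherent light; all three names and magnitudes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
npix = 5
E = 1e-3 * (rng.standard_normal(npix) + 1j * rng.standard_normal(npix))  # C{A} + C{Ag}
p = 1e-3 * (rng.standard_normal(npix) + 1j * rng.standard_normal(npix))  # i C{A phi}
incoherent = 1e-6 * rng.random(npix)   # static stray light, detector bias, planet light

I_plus = np.abs(E + p) ** 2 + incoherent    # image with probe +phi
I_minus = np.abs(E - p) ** 2 + incoherent   # image with probe -phi

diff = I_plus - I_minus
expected = 4.0 * np.real(E * np.conj(p))    # per-pixel form of Eq. 3.2.2
```

The incoherent term and the probe's own intensity appear identically in both images and drop out of the difference, leaving only the cross term.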

3.3 DM Diversity: Batch Process Estimation

The final step is to manipulate Eq. 3.2.2 to separate out the aberrated field in a matrix

form. To do so, we recognize that we only want the real part of the scalar inner product

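between the two quantities and rewrite Eq. 3.2.2 as

\begin{align}
I^{+} - I^{-} &= 4\left(\Re\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \cdot \Re\{i\mathcal{C}\{A\phi\}\} + \Im\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \cdot \Im\{i\mathcal{C}\{A\phi\}\}\right) \notag \\
&= 4 \begin{bmatrix} \Re\{i\mathcal{C}\{A\phi\}\} & \Im\{i\mathcal{C}\{A\phi\}\} \end{bmatrix} \begin{bmatrix} \Re\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \\ \Im\{\mathcal{C}\{A\} + \mathcal{C}\{Ag\}\} \end{bmatrix}. \tag{3.3.1}
\end{align}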

Eq. 3.3.1 separates the probe and aberrated fields into independent matrices but this equation

alone still leaves us with an underdetermined system, meaning the solution is non-unique.

The estimate will have a minimal norm, ||x||min, solution via the right pseudo-inverse instead

of providing an estimate with minimal least-squares error. To complete the DM-Diversity

estimator developed by Give’on et al. [23], we must produce an overdetermined system so

that we can write it as an unweighted batch process that produces an estimate of the aber-

rated field at the current control iteration with least-squares minimal error. The linearized

interaction of the DM probe and the aberrated field, Eq. 3.3.1, can be augmented by taking

multiple difference images using j pre-determined shapes. The image I+j is taken with one

deformable mirror shape, φj, while I−j is the image taken with the negative of that shape,

−φj, applied to the deformable mirror. The difference of each conjugate pair is then used to

construct a matrix of noisy measurements,

$$z = \begin{bmatrix} I_1^{+} - I_1^{-} \\ \vdots \\ I_j^{+} - I_j^{-} \end{bmatrix}. \tag{3.3.2}$$

Defining x as the image plane electric field state, we write z as a linear equation in x and

include additive noise, n,

$$z = Hx + n, \tag{3.3.3}$$

which defines H as the observation matrix that relates the observed quantity to the state

we seek to estimate. By writing x as the real and imaginary parts of the electric field at a

specific pixel,

$$x = \begin{bmatrix} \Re\{\mathcal{C}\{Ag\}\} \\ \Im\{\mathcal{C}\{Ag\}\} \end{bmatrix}, \tag{3.3.4}$$

we can construct H so that it contains the real and imaginary parts of the jth DM pertur-

bation, C{Aφj}, in each row. With multiple pairs of images it takes the form

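$$H = 4 \begin{bmatrix} \Re\{\mathcal{C}\{A\phi_1\}\} & \Im\{\mathcal{C}\{A\phi_1\}\} \\ \vdots & \vdots \\ \Re\{\mathcal{C}\{A\phi_j\}\} & \Im\{\mathcal{C}\{A\phi_j\}\} \end{bmatrix}. \tag{3.3.5}$$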

The product Hx will then match the intensity distribution in the measurement z. With at

least three measurements, j ≥ 3, we can take a left pseudo-inverse to solve for the estimate

of the real and imaginary parts of the aberrated field at each pixel in the image plane with

least-squares minimal error:

$$\hat{x} = (H^T H)^{-1} H^T z \tag{3.3.6}$$

To write the system in full matrix form, the state x is stacked vertically for each pixel and

the observation matrix for a single pixel, H, is ordered into a larger block diagonal matrix.

In most cases, there are enough pixels in the dark hole that the dimension of H becomes very

large and too cumbersome for most mathematical programs to handle the matrix inverse.

So we must construct H as shown here to construct x pixel by pixel so that enough memory

is left to manage the experiment.
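
The per-pixel batch estimate of Eqs. 3.3.2 through 3.3.6 can be sketched as follows. This is an illustrative sketch, not the laboratory code: the function name `dm_diversity_pixel` and the field values are hypothetical, `probes` is assumed to hold the complex probe field at the pixel as it enters Eq. 3.3.1, and `numpy.linalg.lstsq` is used in place of forming the pseudo-inverse explicitly.

```python
import numpy as np

def dm_diversity_pixel(probes, I_plus, I_minus):
    """Least-squares field estimate at one pixel from j probe pairs."""
    probes = np.asarray(probes, dtype=complex)
    # Difference images, Eq. 3.3.2
    z = np.asarray(I_plus, dtype=float) - np.asarray(I_minus, dtype=float)
    # One row per probe pair: 4 * [real, imag] of the probe field, Eq. 3.3.5
    H = 4.0 * np.column_stack([probes.real, probes.imag])
    # Least-squares solution, equivalent to the left pseudo-inverse of Eq. 3.3.6
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return x_hat[0] + 1j * x_hat[1]

# Noise-free check with a made-up field and four made-up probe fields
E_true = 2e-4 - 1e-4j
probes = np.array([3e-4 + 1e-4j, -1e-4 + 2e-4j, 2e-4 - 3e-4j, 1e-4 + 3e-4j])
I_plus = np.abs(E_true + probes) ** 2
I_minus = np.abs(E_true - probes) ** 2

E_hat = dm_diversity_pixel(probes, I_plus, I_minus)
```

Looping this function over the pixels of the dark hole avoids ever building the large block diagonal matrix, which matches the pixel-by-pixel construction described above.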

As we see from Eq. 3.3.6, the DM-Diversity algorithm is simply a least-squares batch

process estimator [68]. The pseudo-inverse minimizes the error because it is effectively aver-

aging the elements of H when the inverse is taken. The power of this algorithm also comes

from the difference imaging of conjugate probe shapes that are applied to the DM. We are

left only with the time-varying camera noise in our measurement, meaning that the sensor

noise n will follow a zero-mean Poisson distribution [35]. The problem becomes invertible

using two image pairs to construct z and H, but a minimum of 3 image pairs must be used

to create an overdetermined system that will produce a unique estimate with least-squares

minimal error from the available data [68]. Practically, we find that 4 image pairs must be

used to get a good enough estimate at the Princeton HCIL, largely to average model errors

and detector noise. Consequently, 8 images are taken per iteration to estimate the electric

field with the DM diversity algorithm. The algorithm has the advantage of being simple,

and relatively robust. The disadvantage is that the algorithm is fundamentally limited by

DM model uncertainty. The robustness of the algorithm comes with a high cost of exposures

that must be repeated every time, a major disadvantage in a system where the time required

for detection will be exposure limited.

3.4 Probe Shapes

With the DM-Diversity estimator in place we can explore the choice of the probe shapes, φj.

They must be chosen to modulate the estimation area well, otherwise the difference between

I+j and I−j would be so small that z would come close to zero. Even worse, the observation

matrix is constructed by computing the probe effect in the estimation area. If we choose

a probe that does not modulate the field in the estimation area well we will get rows that

are effectively zero, making H poorly conditioned. While our choice in the probe shapes is

somewhat arbitrary we must take care that they modulate the estimation area well enough

to produce a well posed problem in Eq. 3.3.6. We guarantee this by choosing shapes based on

analytical functions for which we know the Fourier transform. The DMs being non-conjugate

to the pupil plane will have little effect on this computation since we have shown that the

angular spectrum factor will simply add an additional phase distribution in the image plane.

Following Give’on et al. [23], we will simplify the problem of coverage/shape of the dark hole

by choosing two symmetric rectangular regions that span the region we wish to estimate.

Mathematically we produce a rectangle of width wx and height wy by multiplying two rect

functions, one for each dimension. Applying the inverse Fourier transform, the DM shape

required to produce this rectangle in the image plane is

$$\mathcal{F}^{-1}\{\mathrm{rect}(w_x x)\,\mathrm{rect}(w_y y)\} = \mathrm{sinc}(w_x u)\,\mathrm{sinc}(w_y v). \tag{3.4.1}$$

We offset the rectangle from the center by a distance a in the x dimension and a distance b in

the y direction by convolving it with two pairs of delta functions, one set for each coordinate.

The inverse Fourier transform of two symmetric delta functions is

$$\mathcal{F}^{-1}\left\{\frac{1}{2}[\delta(x - a) + \delta(x + a)] * \frac{1}{2}[\delta(y - b) + \delta(y + b)]\right\} = \cos(au)\cos(bv) \tag{3.4.2}$$

Applying an arbitrary amplitude, c, and the pupil function, A, the two offset rectangles in

the image plane generated by the DM shape φ are

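\begin{align}
\mathcal{F}\{A\phi\} &= \mathcal{F}\{A\} * c\,\mathrm{rect}(w_x x)\,\mathrm{rect}(w_y y) * [\delta(x - a) + \delta(x + a)] * [\delta(y - b) + \delta(y + b)] \tag{3.4.3} \\
&= \mathcal{F}\{cA\,\mathrm{sinc}(w_x u)\} * \mathcal{F}\{cA\,\mathrm{sinc}(w_y v)\} * \mathcal{F}\{cA\cos(au)\} * \mathcal{F}\{cA\cos(bv)\} \notag \\
&= \mathcal{F}\{cA\,\mathrm{sinc}(w_x u)\,\mathrm{sinc}(w_y v)\cos(au)\cos(bv)\} \tag{3.4.4}
\end{align}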

Inverse transforming, the shape we would like the DM to approximate is

$$\phi = c\,\mathrm{sinc}(w_x u)\,\mathrm{sinc}(w_y v)\cos(au)\cos(bv). \tag{3.4.5}$$

The coordinate offset for the delta functions and the width of the rect function are equal to

the frequency of the cosine and sinc functions respectively. Assuming linearity of the DM

actuation, Eq. 2.1.4, we have produced a phase distribution for one DM that results in a

unitary amplitude in two rectangular regions of the image plane. As discussed in Give’on

et al. [23], we must keep in mind that the true distribution at the image plane includes a

convolution with the nominal PSF, F{A}. The PSF will alter the field so that the field

from a DM shape given by Eq. 3.4.5 will not have exact unit amplitude and the edges of

the rectangle will extend by one radius of the PSF. The distribution will still be relatively

uniform, so we are guaranteed to modulate the area under consideration with a reasonable

expectation that each pixel in the dark hole will also be modulated.

If we take the magnitude squared of Eq. 3.4.3, we see that the intensity provided by the

probe shape will be proportional to the square of the amplitude, c, of the shape produced

by Eq. 3.4.5. Generally, the amplitude of these shapes is prescribed by the normalized

intensity of the aberrated field. When we probe, we want to make a significant effect so that

there is a good signal in z, but we do not want to actuate so strongly that we wash out the

aberrations. Thus we choose an actuation amplitude equal to the square root of the average

contrast in the previous iteration [11].
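
The probe shape of Eq. 3.4.5 and the amplitude rule above can be sketched as follows. Several conventions here are assumptions rather than the thesis's own: `np.sinc(t)` is the normalized sinc, sin(πt)/(πt), the offsets a and b are expressed in cycles per unit pupil length (hence the explicit 2π), the pupil function A is taken as 1, and the grid, widths, and contrast value are made up for illustration.

```python
import numpy as np

def probe_shape(u, v, wx, wy, a, b, c):
    """Eq. 3.4.5 with a normalized sinc and offsets in cycles per unit length."""
    return (c * np.sinc(wx * u) * np.sinc(wy * v)
            * np.cos(2 * np.pi * a * u) * np.cos(2 * np.pi * b * v))

n, du = 128, 0.25
coords = (np.arange(n) - n // 2) * du
u, v = np.meshgrid(coords, coords, indexing="xy")

mean_contrast = 1e-6              # hypothetical average contrast from the prior iteration
c = np.sqrt(mean_contrast)        # amplitude set to the square root of the contrast
wx, wy, a, b = 0.5, 1.0, 1.0, 0.0
phi = probe_shape(u, v, wx, wy, a, b, c)

# Image-plane footprint of the probe via a centered FFT
spectrum = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(phi)))
freq = np.fft.fftshift(np.fft.fftfreq(n, d=du))
x, y = np.meshgrid(freq, freq, indexing="xy")
in_box = (np.abs(np.abs(x) - a) <= wx / 2) & (np.abs(np.abs(y) - b) <= wy / 2)
fraction = np.sum(np.abs(spectrum[in_box]) ** 2) / np.sum(np.abs(spectrum) ** 2)
```

Here `fraction` is the share of the probe's image-plane energy that falls inside the two target rectangles; it falls short of one only because the sinc is truncated by the finite grid.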

The experimental results using probe pairs with the DM Diversity estimation algorithm

[23] are found in Ch. 5. This estimator is used to supply an estimate to both the windowed

and monochromatic forms of the stroke minimization algorithm developed in Ch. 2.

Chapter 4

Kalman Filter Estimation

The DM-Diversity algorithm described in Ch. 3 is quite effective, but it is limited by the

fact that it is only a batch process method. As shown in Fig. 4.1, it does not close the

loop on the state estimate. Therefore all state estimate information, x, acquired about the

electric field in the prior control step is lost. Thus we start over at each iteration, requiring

that we take a full set of estimation images to estimate the field again. In addition to being

Figure 4.1: Block diagram of a standard FPWC control loop. At any time step, k, only the intensity measurements, zk, provide any feedback to estimate the current state, xk, for control. The red dashed lines show additional feedback from the prior electric field (or state) estimate, xk, and the control signal, uk, used to suppress it.

very costly with regard to exposures, the measurements will become progressively noisier as

we reach higher contrast levels. If we include feedback of the state estimate we will have

a certain degree of robustness to new, noisy measurements by including information from

prior measurements with better signal-to-noise. Since we already have demonstrated a model

based controller, we should be able to use this model to predict the change in the electric

field after the controller has applied a DM command. In doing so we do want to consider

the relative effect of process and detector noise to optimally combine an extrapolation of the

state estimate with new measurement updates. This is exactly the problem a discrete time

Kalman filter solves.

4.1 Constructing the Optimal Filter

A Kalman filter includes prior state estimate history by extrapolating a new estimate of the

state using a model, then optimally updating the estimate with a sequence of measurements

taken at discrete intervals. For the time being, the Kalman filter estimator will still use DM

probe pairs as described in §3 for the measurement update. In this way we are not testing

whether there is a better way to obtain information regarding the electric field, but rather

changing the way we use the information from the applied probe shapes to reconstruct the

electric field estimate. In particular, we use the prior information to improve our ability

to estimate the field in later iterations. This will allow us to use fewer measurements to

reconstruct the electric field.

Following the notation used in Stengel [68], we begin by assuming we have a state,

defined as the electric field estimate from the prior iteration, xk−1(+). The plus indicates

that the estimate was updated at the prior iteration with some measurement, be it from an

initialization or a prior estimation and control step. Prior to any additional measurements,

we will extrapolate from xk−1(+) to the current time step, xk(−), by an arbitrary function

(to be defined later in the chapter). We also seek a metric to gauge the uncertainty in

the estimate. Following Stengel [68], we define the extrapolated state estimate covariance,

Pk(−), as the expected value of the error between the estimate and the true state, xk:

$$P_k(-) = E[(\hat{x}_k(-) - x_k)(\hat{x}_k(-) - x_k)^T]. \tag{4.1.1}$$

We now seek to optimally include new measurements to improve the state and covariance

estimates. These noisy measurements, zk = yk + nk, will still be difference images of probe

pairs. As discussed in Ch. 3, the conjugate pairs allow us to construct a linear observation

matrix, Hk, which stems from Eq. 2.1.9. If we were not in a low aberration regime the

observer would have to be nonlinear. This is not impossible for a Kalman filter, but can

make it highly biased [68] and computationally expensive. As we decide how to optimally

update the field, we must also have an estimate of the measurement noise covariance, which

we define as

$$R_k = E[n_k n_k^T]. \tag{4.1.2}$$

To properly demonstrate the conditions under which the Kalman filter is truly optimal,

we would have to show that propagating the state estimate is a Gauss-Markov sequence

(indicating that optimality requires white Gaussian inputs and Gaussian initial conditions)

[68]. This is rather tedious and can be found in many textbooks that discuss the Kalman

filter, so it will not be re-derived here. However, it is worth demonstrating that like the batch

process method the Kalman filter does produce an estimate with least-squares minimal error.

Since the Kalman filter operates on the estimate in closed loop, the weighted cost function

used to derive the batch process solution,

$$J = \frac{1}{2}[H_k x_k - z_k]^T R_k^{-1} [H_k x_k - z_k], \tag{4.1.3}$$

will not adequately represent the error contributions in the system. We must also include an

estimate of the state covariance, since this will also propagate error in the estimate update.

Defining the error as both the difference between the noisy observation and the estimated

observation, Hkxk(+)− z, and the difference between the current estimate and the estimate

extrapolation, (xk − xk(−)), we write the quadratic cost function as

$$J = \frac{1}{2}[x_k - \hat{x}_k(-)]^T P_k(-)^{-1} [x_k - \hat{x}_k(-)] + \frac{1}{2}[H_k x_k - z_k]^T R_k^{-1} [H_k x_k - z_k]. \tag{4.1.4}$$

We can formulate the cost in matrix form as

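\begin{align}
J &= \frac{1}{2} \begin{bmatrix} x_k - \hat{x}_k(-) \\ H_k x_k - z_k \end{bmatrix}^T \begin{bmatrix} P_k(-) & 0 \\ 0 & R_k \end{bmatrix}^{-1} \begin{bmatrix} x_k - \hat{x}_k(-) \\ H_k x_k - z_k \end{bmatrix} \tag{4.1.5} \\
&= \frac{1}{2} \left( \begin{bmatrix} I \\ H_k \end{bmatrix} x_k - \begin{bmatrix} \hat{x}_k(-) \\ z_k \end{bmatrix} \right)^T \begin{bmatrix} P_k(-) & 0 \\ 0 & R_k \end{bmatrix}^{-1} \left( \begin{bmatrix} I \\ H_k \end{bmatrix} x_k - \begin{bmatrix} \hat{x}_k(-) \\ z_k \end{bmatrix} \right) \notag \\
&= \frac{1}{2} (\tilde{H}_k x_k - \tilde{z}_k)^T \tilde{R}_k^{-1} (\tilde{H}_k x_k - \tilde{z}_k), \tag{4.1.6}
\end{align}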

where we have now defined a new set of augmented matrices as

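\begin{align}
\tilde{H}_k &= \begin{bmatrix} I \\ H_k \end{bmatrix}, \tag{4.1.7} \\
\tilde{z}_k &= \begin{bmatrix} \hat{x}_k(-) \\ z_k \end{bmatrix}, \tag{4.1.8} \\
\tilde{R}_k &= \begin{bmatrix} P_k(-) & 0 \\ 0 & R_k \end{bmatrix}. \tag{4.1.9}
\end{align}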

This allows us to write an analog to the weighted cost function used to compute the batch

process estimator. Taking the partial derivative with respect to the state estimate xk and

evaluating at the optimal update, xk(+), we find

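$$\left. \frac{\partial J(\tilde{z}_k)}{\partial x_k}^{T} \right|_{\hat{x}_k(+)} = \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k \hat{x}_k(+) - \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{z}_k. \tag{4.1.10}$$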

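Setting the partial derivative to zero, the optimal state update is

\begin{align}
\hat{x}_k(+) &= (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{z}_k \tag{4.1.11} \\
&= \left[P_k(-)^{-1} + H_k^T R_k^{-1} H_k\right]^{-1} \left[P_k(-)^{-1} \hat{x}_k(-) + H_k^T R_k^{-1} z_k\right] \notag \\
&= \left[P_k(-) - P_k(-) H_k^T (H_k P_k(-) H_k^T + R_k)^{-1} H_k P_k(-)\right] \cdot \left[P_k(-)^{-1} \hat{x}_k(-) + H_k^T R_k^{-1} z_k\right] \notag \\
&= \hat{x}_k(-) + P_k(-) H_k^T \left[H_k P_k(-) H_k^T + R_k\right]^{-1} \left[z_k - H_k \hat{x}_k(-)\right]. \tag{4.1.12}
\end{align}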

From Eq. 4.1.12, we define the optimal gain to be

$$K_k = P_k(-) H_k^T \left[H_k P_k(-) H_k^T + R_k\right]^{-1}. \tag{4.1.13}$$

Eq. 4.1.13 optimally combines the prior estimate history with measurement updates

to minimize the total error contributions based on the expected state and measurement

covariance. Much like the batch process method the Kalman filter produces a solution

that minimizes a quadratic cost function, Eq. 4.1, but it is also subject to the constraining

dynamic equations given by xk(−) and Pk(−). However, looking at Eq. 4.1.6 there is a

major advantage of the Kalman filter in its minimization of the cost function. For the augmented observation matrix of Eq. 4.1.7 to

be overdetermined, we only require a single measurement. Thus, at a fundamental level the

Kalman filter is formulated in such a way that it solves a least squares, left pseudo-inverse

problem, regardless of the number of measurements taken. This gives us the freedom to

minimize the number of exposures required to estimate the field to a precision adequate for

suppressing the field to the target contrast level.
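
The measurement update of Eqs. 4.1.12 and 4.1.13 can be sketched for a single pixel and a single probe pair, the case that is underdetermined for the batch method. This is a minimal sketch with assumed values: the function name `kalman_update`, the prior, the probe row, and the noise level are all hypothetical, and explicit matrix inverses are used for readability rather than numerical robustness.

```python
import numpy as np

def kalman_update(x_prior, P_prior, H, R, z):
    """Measurement update: gain, state update, and updated covariance."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)                      # Eq. 4.1.13
    x_post = x_prior + K @ (z - H @ x_prior)                  # Eq. 4.1.12
    # Updated covariance in information form, Eq. 4.1.15
    P_post = np.linalg.inv(np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H)
    return x_post, P_post, K

# One pixel (two states: real and imaginary part), one probe pair (one row in H)
x_prior = np.array([2.0e-4, -1.0e-4])
P_prior = 1e-8 * np.eye(2)
H = 4.0 * np.array([[3.0e-4, 1.0e-4]])
R = np.array([[1e-16]])
z = np.array([2.5e-7])

x_post, P_post, K = kalman_update(x_prior, P_prior, H, R, z)
```

Even though H has a single row, the update is well posed because the prior covariance supplies the remaining information, which is the point made above about the augmented system.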

While the form of the cost functions is the same, they are evaluating different criteria.

Consequently, we cannot use the cost functions to directly compare their optimality.

Instead we look to the only other metric of comparison, the covariance at each iteration. To

update the state covariance estimate, Pk(+), we continue to use the augmented matrices in

Eqs. 4.1.7-4.1.9. Beginning with the expected value function shown in Table 4.1, we write

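\begin{align}
P_k(+) &= E[(\hat{x}_k(+) - x_k)(\hat{x}_k(+) - x_k)^T] \tag{4.1.14} \\
&= E\left[ \left[(\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{n}_k\right] \left[(\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1} \tilde{n}_k\right]^T \right] \notag \\
&= \left[(\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1}\right] E[\tilde{n}_k \tilde{n}_k^T] \left[(\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \tilde{H}_k^T \tilde{R}_k^{-1}\right]^T \notag \\
&= (\tilde{H}_k^T \tilde{R}_k^{-1} \tilde{H}_k)^{-1} \notag \\
&= \left[P_k(-)^{-1} + H_k^T R_k^{-1} H_k\right]^{-1}. \tag{4.1.15}
\end{align}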

For comparison, we evaluate the covariance of a weighted form of the batch process method

described in Ch. 3, which is

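\begin{align}
P &= E[(\hat{x} - x)(\hat{x} - x)^T] \tag{4.1.16} \\
&= E\left[ \left((H^T R^{-1} H)^{-1} H^T R^{-1} n\right) \left((H^T R^{-1} H)^{-1} H^T R^{-1} n\right)^T \right] \notag \\
&= \left((H^T R^{-1} H)^{-1} H^T R^{-1}\right) E[n n^T] \left((H^T R^{-1} H)^{-1} H^T R^{-1}\right)^T \notag \\
&= (H^T R^{-1} H)^{-1}. \tag{4.1.17}
\end{align}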

As shown in Eq. 4.1.17, the state covariance of the batch process method resets after every

control step, and is tied to the noise in that particular set of measurements. However, the

covariance of the Kalman filter is also a function of the prior state covariance. Looking at

Eq. 4.1.15, the term Hk^T Rk^−1 Hk is guaranteed to be positive semidefinite (and positive definite when Hk has full column rank). Thus additional measurements

taken at each iteration will act to reduce the magnitude of the covariance since additional

measurements can do nothing but make the inversion smaller.
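
The covariance comparison between Eq. 4.1.15 and Eq. 4.1.17 can be illustrated numerically. The matrices below are random or made-up stand-ins, not laboratory values; the check is only that the Kalman covariance is no larger than either the prior covariance or the batch covariance built from the same measurements.

```python
import numpy as np

rng = np.random.default_rng(3)
H = rng.standard_normal((4, 2))                # 4 probe pairs, 2 states (real, imag)
R = np.diag(rng.uniform(0.5, 2.0, size=4))     # hypothetical measurement noise covariance
P_prior = np.array([[0.8, 0.1], [0.1, 0.5]])   # hypothetical prior state covariance

info = H.T @ np.linalg.inv(R) @ H
P_batch = np.linalg.inv(info)                                # Eq. 4.1.17
P_kalman = np.linalg.inv(np.linalg.inv(P_prior) + info)      # Eq. 4.1.15

# Both differences should have non-negative eigenvalues
gain_over_prior = np.linalg.eigvalsh(P_prior - P_kalman)
gain_over_batch = np.linalg.eigvalsh(P_batch - P_kalman)
```

Non-negative eigenvalues in both differences mean the updated covariance is bounded above by the prior and by the batch result, in the matrix sense.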

We can use the contrast normalization for the measurements, I00, to get an idea of the

estimator’s robustness. Looking ahead to §6.1, we see that I00 is a function of exposure time.

Thus if we do not take a long enough exposure in the probe images Rk will become quite

large, indicating a poor signal to noise ratio. This exposure time is based on the detection

limit for a given laser power, as discussed in §6.1. For the 2 mW laser power used in the

monochromatic experiments, this is on the order of 100 ms to detect 1×10^−7 contrast levels.

In the broadband experiments, the power levels are on the order of a microwatt for any given

wavelength, which means we require exposure times on the order of tens of seconds. In the

batch-process estimator, we are stuck with these measurements and will receive an estimate

with large covariance. In this case the control will not be effective, which is why we often see

jumps in contrast when using this estimator once we reach low contrast levels. In the case of

the Kalman filter, this high covariance is dampened by the contribution of prior covariance

estimates via Pk(−), stabilizing the state estimate and its covariance in the event that we

take a bad measurement. Since we cannot guarantee that a probe will provide good signal,

particularly at low contrast levels, this is an extremely attractive component of the Kalman

filter estimator. To complete the filter we need to propagate the prior estimate, xk−1(+),

to the current time step. The filter extrapolates to the current state estimate, xk(−), by

applying a time update to the prior state estimate via the state transition matrix, Φk−1, and

numerically propagating the control output from stroke minimization at the prior iteration,

uk−1, via a linear transformation described by Γk−1. We also have a disturbance from the

process noise, wk−1, which is propagated to the current state of the electric field via the

linear transformation, Λk−1. Assuming these components are additive, the state estimate

extrapolation is

$$\hat{x}_k(-) = \Phi_{k-1}\hat{x}_{k-1}(+) + \Gamma_{k-1}u_{k-1} + \Lambda_{k-1}w_{k-1}. \tag{4.1.18}$$

We will apply the linearized optical model used to develop the batch process estimation

method and both control algorithms described in Ch. 2 and Ch. 3. Using a linearized model

avoids generating arbitrary bias in the estimate at each pixel, a common problem with a

nonlinear filter [21]. The first step in propagating the state forward in time is to update any

dynamic variation between the discrete time steps with the state transition matrix, Φk−1.

In this system, Φk−1 captures any variation of the field due to temperature fluctuations,


vibration, or air turbulence that perturb the optical system. To simplify the model, we

recognize that there is no reliable way to measure or approximate small changes in the

optical system over time with alternate sensors; we assume that the state remains constant

between control steps, making the state transition matrix the identity, Φk−1 = Φ = I.

The process noise is any disturbance input into the system. The dominant contributor to

this will be errors in our expectation of the DM actuation, which will be discussed in greater

detail following this derivation. For the time being, we make the standard assumption that

the process noise is Gaussian white noise, which means its expected value is zero. Thus, the

expected value of the state when we extrapolate is

$$x_k(-) = \Phi_{k-1} x_{k-1}(+) + \Gamma_{k-1} u_{k-1}. \tag{4.1.19}$$

The covariance of the process noise will be handled in the covariance extrapolation. For

the same reason that we may treat Φ as a constant matrix, the optical system is assumed

to be stable enough that the linearized propagation of the control uk−1 is constant, making

Γk−1 = Γ. This matrix must map the control effect of the DM actuators to the image

plane electric field, but we need to sort it such that every pair of rows in Γu is the real

and imaginary parts of a particular pixel. To begin, we look to the control effect matrices

produced in Ch. 2. Recalling Eq. 2.1.14, we can produce a vector of complex values via

Gu = C{iAφ}. (4.1.20)

To produce Γ, we simply need to sort the control effect into the real and imaginary

parts per pixel. We have to take the real and imaginary parts pixel by pixel so that each

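block element of the matrix forms as

$$\begin{bmatrix} \Re\{G\}_{n,:} \\ \Im\{G\}_{n,:} \end{bmatrix} = \begin{bmatrix} \Re\{G_{DM1}\}_{n,:} & \Re\{G_{DM2}\}_{n,:} \\ \Im\{G_{DM1}\}_{n,:} & \Im\{G_{DM2}\}_{n,:} \end{bmatrix}, \tag{4.1.21}$$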

where (n, :) indicates that we have taken all columns of the nth row in G. This block,

Gn,:, maps the effect of every DM actuator onto the nth pixel. We must reorganize the

control matrix in this manner for the sake of Hk. If we were to reorganize the state and

control vectors so that the real and imaginary components were stacked such that x =

[ℜ{C{Ag}} ℑ{C{Ag}}]^T, Hk would be arranged in a sparse form rather than as a block

diagonal matrix. Thus, each submatrix for Γ shown in Table 4.1 is of dimension 2× 2NDM

and represents the control effect on a single pixel of the matrix.
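The row ordering described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `interleave_real_imag`, the array sizes, and the random values standing in for the DM1 and DM2 control effect matrices are all hypothetical and are not taken from the laboratory model.

```python
import numpy as np

def interleave_real_imag(G):
    """Map a complex (n_pix x n_act) matrix to a real (2*n_pix x n_act) one.

    Rows 2n and 2n+1 of the result hold the real and imaginary parts of
    row n of G, so each row pair belongs to a single pixel.
    """
    n_pix, n_act = G.shape
    gamma = np.empty((2 * n_pix, n_act))
    gamma[0::2, :] = G.real
    gamma[1::2, :] = G.imag
    return gamma

# Hypothetical sizes and random values, for illustration only.
rng = np.random.default_rng(0)
n_pix, n_dm = 3, 4
G_dm1 = rng.normal(size=(n_pix, n_dm)) + 1j * rng.normal(size=(n_pix, n_dm))
G_dm2 = rng.normal(size=(n_pix, n_dm)) + 1j * rng.normal(size=(n_pix, n_dm))

G = np.hstack([G_dm1, G_dm2])       # columns: DM1 actuators, then DM2
Gamma = interleave_real_imag(G)     # shape (2*n_pix, 2*n_dm)

u = rng.normal(size=2 * n_dm)       # DM1 stacked on top of DM2
field = G @ u                       # complex field change per pixel
x_from_gamma = Gamma @ u            # same information as a real-valued stack
```

With this ordering, entries 2n and 2n+1 of `Gamma @ u` are the real and imaginary parts of the field change at pixel n, which is what keeps the observation matrix block diagonal.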

Following Stengel [68], we use the state transition matrix to propagate the prior covariance

estimate forward. Applying an additive term for the process noise, Qk−1, the extrapolated

covariance estimate is

$$P_k(-) = \Phi_{k-1} P_{k-1}(+) \Phi_{k-1}^T + Q_{k-1}, \tag{4.1.22}$$

where

$$Q_k = E[w_k w_k^T]. \tag{4.1.23}$$

The details of how we formulate Qk, and the sensor noise, Rk, will be addressed in more

detail in §4.2. Combining Eq. 4.1.12, Eq. 4.1.13, and Eq. 4.1.15 with the extrapolation

equations used in the cost function, we have the discrete time Kalman filter. This form of

the filter consists of five equations that describe the state estimate extrapolation, covariance

estimate extrapolation, filter gain computation, state estimate update, and covariance

estimate update at the kth iteration [68]:

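$$
\begin{aligned}
x_k(-) &= \Phi_{k-1} x_{k-1}(+) + \Gamma_{k-1} u_{k-1} && (4.1.24)\\
P_k(-) &= \Phi_{k-1} P_{k-1}(+) \Phi_{k-1}^T + Q_{k-1} && (4.1.25)\\
K_k &= P_k(-) H_k^T \left[ H_k P_k(-) H_k^T + R_k \right]^{-1} && (4.1.26)\\
x_k(+) &= x_k(-) + K_k \left[ z_k - H_k x_k(-) \right] && (4.1.27)\\
P_k(+) &= \left[ P_k(-)^{-1} + H_k^T R_k^{-1} H_k \right]^{-1} && (4.1.28)
\end{aligned}
$$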

A fundamental property of the Kalman filter is that the optimal gain, Eq. 4.1.26, is

not based on measurements, but rather estimates of the state covariance, Pk(−), process

noise from the actuation Qk−1, and sensor noise Rk. This means that the optimality of the

estimate is closely related to the accuracy and form of these matrices; this will be discussed

at length in §4.2. The gain matrix, Kk, is ultimately what balances uncertainty in the prior

state estimate against uncertainty in the measurements zk when computing the final state

estimate update, xk(+).
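As a minimal sketch of how the five equations fit together, the following NumPy function performs one pass of the filter with Φ = I. The toy dimensions, the per-pixel probe block, and the numerical values of Q, R, and the initial covariance are made up for illustration; they are not the laboratory values.

```python
import numpy as np

def kalman_step(x_prev, P_prev, u_prev, z, H, Gamma, Q, R):
    """One pass through Eqs. 4.1.24-4.1.28 with Phi = I."""
    x_minus = x_prev + Gamma @ u_prev                      # (4.1.24)
    P_minus = P_prev + Q                                   # (4.1.25)
    S = H @ P_minus @ H.T + R
    K = P_minus @ H.T @ np.linalg.inv(S)                   # (4.1.26)
    x_plus = x_minus + K @ (z - H @ x_minus)               # (4.1.27)
    P_plus = np.linalg.inv(np.linalg.inv(P_minus)
                           + H.T @ np.linalg.inv(R) @ H)   # (4.1.28)
    return x_plus, P_plus, K

# Hypothetical toy problem: 2 pixels, 2 probe pairs per pixel, 3 actuators.
rng = np.random.default_rng(0)
block = np.array([[1.0, 0.5], [-0.5, 1.0]])   # made-up per-pixel probe block
H = np.kron(np.eye(2), block)                 # block diagonal, 4 x 4
Gamma = rng.normal(size=(4, 3))
u_prev = 0.01 * rng.normal(size=3)
x_true = np.array([1.0, -2.0, 0.5, 3.0])
z = H @ x_true                                # noise-free measurement

x_plus, P_plus, K = kalman_step(
    x_prev=np.zeros(4), P_prev=np.eye(4), u_prev=u_prev, z=z,
    H=H, Gamma=Gamma, Q=0.01 * np.eye(4), R=1e-4 * np.eye(4))
```

Note that the gain depends only on the covariances and H, not on the measurement itself. In this toy case the sensor noise is small relative to the prior covariance, so the gain weights the measurement heavily and the updated state lands close to `x_true`.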

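Matrix    Dimension

$\Phi = I$    $(2 \cdot N_{pix}) \times (2 \cdot N_{pix})$

$\Gamma = \begin{bmatrix} \begin{bmatrix} \Re\{G_{DM1}\} & \Re\{G_{DM2}\} \\ \Im\{G_{DM1}\} & \Im\{G_{DM2}\} \end{bmatrix}_1 \\ \vdots \\ \begin{bmatrix} \Re\{G_{DM1}\} & \Re\{G_{DM2}\} \\ \Im\{G_{DM1}\} & \Im\{G_{DM2}\} \end{bmatrix}_n \end{bmatrix}$    $(2 \cdot N_{pix}) \times (2 \cdot N_{DM})$

$\Lambda = \Gamma$    $(2 \cdot N_{pix}) \times (2 \cdot N_{DM})$

$P_0 = E[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T]$    $(2 \cdot N_{pix}) \times (2 \cdot N_{pix})$

$Q_k = \Lambda E[w_k w_k^T] \Lambda^T$    $(2 \cdot N_{pix}) \times (2 \cdot N_{pix})$

$H_k = \mathrm{diag}\left( \begin{bmatrix} \Re\{G_{DM2}\phi_{k1}\} & \Im\{G_{DM2}\phi_{k1}\} \\ \vdots & \vdots \\ \Re\{G_{DM2}\phi_{kj}\} & \Im\{G_{DM2}\phi_{kj}\} \end{bmatrix}_n \right)$    $(N_{pix} \cdot N_{pairs}) \times (2 \cdot N_{pix})$

$R_k = E[n_k n_k^T]$    $(N_{pix} \cdot N_{pairs}) \times (N_{pix} \cdot N_{pairs})$

$K_k$ is computed    $(2 \cdot N_{pix}) \times (N_{pix} \cdot N_{pairs})$

Table 4.1: Definition of all filter matrices. NDM is the number of actuators on a single DM, Npix is the number of pixels in the area targeted for dark hole generation, and Npairs is the number of image pairs taken while applying positive and negative shapes to the deformable mirror.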

With Hk, zk, Γ, x, and uk constructed, the dimension and form of the rest of the filter

follows. Table 4.1 and Table 4.2 define all the matrices and vectors in the filter equations

for this problem and provide their dimensionality for clarity. The initialization of the

covariance, P0, is critical for the performance of the filter. In our system this cannot be

measured, so we must initialize with a reasonable guess. We can use the final covariance

matrix from a prior control attempt (to maximum achievable contrast) to initialize the filter

in the future so that its form might be more accurate. We compute the process noise

assuming a standard zero-mean variance on the actuation of the DMs, w. The sensor noise

is determined statistically from the readout noise that the detector exhibits when taking

dark frames. As in Ch. 3, the focal plane measurements zk are

constructed as a vertical stack of difference images taken in a “pair-wise” fashion to

produce j measurements for n pixels. Likewise Hk takes on a similar form, and is a matrix

constructed from the effect of a specific deformable mirror shape φj on the real and imaginary

parts of the electric field in the image plane. Finally, we compute the covariance update,

Pk(+), based on the added noise from the new measurements. The estimated state is a

vertical stack of the real and imaginary parts of the electric field at each pixel of the dark

hole in the image plane. The control signal u is a vertical stack of the actuators of each DM,

with DM1 being stacked on top of DM2. Since we are only considering process noise at

the DMs, the process disturbance w is a vertical stack of the variance expected from each

actuator.

Recalling Eq. 3.3.1, Hk is constructed by separating the real and imaginary parts of

the DM probe field. Thus it will be underdetermined unless at least 2 pairs of images

are used in the measurement, one of the major limitations of the batch process method in

Ch. 3. This will result in a non-unique solution to the state when using a batch-process,

and will only provide the solution with the smallest quadratic norm since it must be solved

via the right pseudo-inverse. On the other hand, the Kalman filter only requires a single

measurement as an update to the state. Therefore it isn’t necessary for the matrix to be

square or overdetermined, and we maintain a favorable dimensionality when updating the

state.
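The difference between the two estimators for a single image pair can be illustrated at one pixel. The sketch below assumes the pairwise measurement is the difference of intensities with the probe field added and subtracted, which gives a row of H proportional to the real and imaginary parts of the probe field. Eq. 3.3.1 is not reproduced in this section, so the factor of 4 and the numerical values (field, probe, prior, covariances) are assumptions made for illustration.

```python
import numpy as np

# Hypothetical complex field E and probe field p = G*phi at one pixel.
E = 0.8 - 0.3j
p = 0.1 + 0.2j

# Pairwise difference image and its linear form in [Re E, Im E].
z = abs(E + p) ** 2 - abs(E - p) ** 2
H = 4.0 * np.array([[p.real, p.imag]])        # 1 x 2: one pair, two unknowns
x_true = np.array([E.real, E.imag])

# Batch estimate from a single pair: right pseudo-inverse (minimum norm).
x_min_norm = H.T @ np.linalg.inv(H @ H.T) @ np.array([z])

# Kalman update with the same single pair, starting from a prior estimate.
x_minus = np.array([0.7, -0.2])               # made-up prior state
P_minus = 0.05 * np.eye(2)                    # made-up prior covariance
R = 1e-3 * np.eye(1)                          # made-up sensor noise
K = P_minus @ H.T @ np.linalg.inv(H @ P_minus @ H.T + R)
x_plus = x_minus + K @ (np.array([z]) - H @ x_minus)
```

Both estimates are consistent with the single measurement, but the minimum-norm solution discards everything not observed by this probe, while the filter only corrects the prior along the direction the probe can see. In this example that leaves the filter estimate much closer to the true field.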

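Variable    Dimension

$z = \begin{bmatrix} \begin{bmatrix} I_1^+ - I_1^- \\ \vdots \\ I_j^+ - I_j^- \end{bmatrix}_1 \\ \vdots \\ \begin{bmatrix} I_1^+ - I_1^- \\ \vdots \\ I_j^+ - I_j^- \end{bmatrix}_n \end{bmatrix}$    $(N_{pix} \cdot N_{pairs}) \times 1$

$x = \begin{bmatrix} \begin{bmatrix} \Re\{E_1\} \\ \Im\{E_1\} \end{bmatrix} \\ \vdots \\ \begin{bmatrix} \Re\{E_n\} \\ \Im\{E_n\} \end{bmatrix} \end{bmatrix}$    $(2 \cdot N_{pix}) \times 1$

$u = \begin{bmatrix} DM1 \\ DM2 \end{bmatrix}$    $(2 \cdot N_{DM}) \times 1$

$w = \begin{bmatrix} \sigma_{DM1} \\ \sigma_{DM2} \end{bmatrix}$    $(2 \cdot N_{DM}) \times 1$

Table 4.2: Definition of filter vectors. NDM is the number of actuators on a single DM, Npix is the number of pixels in the area targeted for dark hole generation, and Npairs is the number of image pairs taken while applying positive and negative shapes to the deformable mirror.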

In the scenario of coronagraphic imaging, we are photon limited which typically means

exposure time is by far the limiting factor when estimating the field. However, there are a

lot of mathematical operations involved with a Kalman filter. It is worth looking at the

computation time required to compute the update since it will ultimately limit the speed

with which we may estimate the field. The number of mathematical operations follows from

the dimension of the matrices given in Table 4.1 and Table 4.2. Thus, the computation is

directly dependent on the size of the dark hole, the number of actuators on the DMs, and

the number of measurements taken at each step. For a fixed dark hole size and a set number

of actuators, all we can do is attempt to minimize the number of measurement updates per

iteration. Presumably we could bin the camera to reduce the number of pixels, Npix, but


this will not benefit a true observatory since the image plane is Nyquist sampled (implying

a loss of necessary spatial information if we bin the detector). The number of actuators is

not a limiting factor in any space or ground telescope design to date, but this does bring to

light a general control and estimation challenge for the next generation of extremely large

telescopes (ELTs). AO studies for ELTs are investigating DMs with ≈ 40,000 actuators or

more, meaning there will be a numerical challenge for any estimation and control scheme

(even a conventional atmospheric AO scheme)[41]. In this case, we would likely have to come

up with a way to reduce the dimension of the problem. However, in current observatory

scenarios the highest available actuator count is limited to 4,000 actuators. In this case

we will certainly be limited by exposure time even if we estimate the field over the entire

controllable area of the DM (since exposure times suitable for high contrast detection are

on the order of minutes to hours). In the laboratory experiment, with 2 mW of laser power

using the Ripple3 coronagraph[5] the computation time for the estimation step is on the

order of seconds, which is much faster than the field variability at the 1× 10−7 level. Since

readout takes approximately 0.6 seconds on the Starlight Xpress SXV-M9 camera, exposure

time is a significant fraction of the time required to estimate the field, even in the case of

high laser power levels. Thus, the speed of the Kalman filter computation is not limiting our

current achievable contrast levels.
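To make the dependence of the computation on dark hole size, actuator count, and number of measurements concrete, the dimensions in Table 4.1 can be tabulated for a given configuration. The helper name `filter_dimensions` and the pixel count used in the example are hypothetical; the 1024 actuators per DM corresponds to the 32x32 array mentioned in §4.2.

```python
def filter_dimensions(n_pix, n_dm, n_pairs):
    """Return the (rows, cols) of each filter matrix, following Table 4.1."""
    n_state = 2 * n_pix            # real and imaginary part per pixel
    n_meas = n_pix * n_pairs       # one difference image per pair per pixel
    return {
        "Phi": (n_state, n_state),
        "Gamma": (n_state, 2 * n_dm),
        "P": (n_state, n_state),
        "Q": (n_state, n_state),
        "H": (n_meas, n_state),
        "R": (n_meas, n_meas),
        "K": (n_state, n_meas),
    }

# Hypothetical dark hole of 1000 pixels, two 32x32 DMs, four image pairs.
dims = filter_dimensions(n_pix=1000, n_dm=1024, n_pairs=4)
inverted_size = dims["R"][0]       # side length of the matrix inverted in the gain
```

The matrix inverted in the gain computation has side length Npix·Npairs, so reducing the number of pairs shrinks the largest inversion as well as the exposure count.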

4.2 Sensor and Process Noise

Two important design parameters for the performance of the filter are the process noise,

Qk−1, and the sensor noise, Rk. In order for the filter to operate optimally in the laboratory

we must make reasonable assumptions for the values that exist in the laboratory. Rather

than running a simulation to find the most likely values, we appeal to physical scaling of the

two largest known sources of error in the system. Our sensor noise will be the dark current

and read noise inherent to the detector. Process noise will largely come from errors in the


actuation shape.

Defining the control perturbation for the Kalman filter such that it is treated as an

additive error allows us to appeal to the value of the process noise Q physically, since we do

not have a way to measure it. For a disturbance, wk, we define the covariance of the process

noise as

$$Q'_k = E[w_k w_k^T]. \tag{4.2.1}$$

Following the definitions for Q in Stengel [68], the process noise at the image plane is

$$Q_k = \Lambda Q'_k \Lambda^T, \tag{4.2.2}$$

where Λ propagates the process noise to the image plane. To propagate process noise onto

the image plane, Qk = ΛE[wwT ]ΛT , we assume w is solely from the variance in DM actuation.

This allows us to choose Λk−1 = Λ = Γ. We can trace errors of the actuation shape to two

sources. The first is the resolution of the digital-to-analog (D/A) converter, which is 14

bits over a 250 Volt range. This gives us a resolution of approximately 0.015 Volts, which

corresponds to approximately 0.03 nm vertical resolution. This is much more precise than

the surface knowledge, which is the true limitation. Poor knowledge of the surface comes

from the inherent nonlinearity in the voltage-to-actuation gain as a function of voltage,

the variance in this gain from actuator to actuator, and the accuracy of the superposition

model used to construct the mirror surface that covers the 32x32 actuator array of the

Boston Micromachines kilo-DM. Physical models, such as those found in Blain et al. [9],

have been constructed to produce a more accurate surface prediction over the full 1.5µm

stroke range with an rms error of ≈ 10 nanometers. Since we operate in a low actuation

regime, superposition is still a relatively safe assumption as evidenced by current laboratory

success. The Kalman filter presents an elegant solution where we can treat actuation errors

as additive process noise and include them in the estimator in a statistical fashion, rather

than deterministically in a physical model. Since there is no physical reasoning to justify


varying Qk at each iteration, it will be kept constant throughout the entire control history

(Q′k = Q′ = constant). Two versions of Q′ can be considered in this case. The first is where

we simply have no correlation between actuators, giving a purely diagonal matrix with a

magnitude corresponding to the square of the actuation standard deviation, σu. Note that while Q′ is

diagonal the process noise at the image plane, Q, will not be diagonal because of the linear

transformation to propagate this variance to the image plane via Γ. The second version of

Q′ which we may consider is one that has symmetric off-diagonal elements. This will treat

uncertainty due to inter-actuator coupling and errors in the superposition model statistically.

As a first step, we will not consider inter-actuator coupling to help avoid a poorly conditioned

matrix. This helps guarantee that the Kalman filter itself will be well behaved. Thus the

process noise for the filter will be

$$Q = \sigma_u^2 \Gamma I \Gamma^T. \tag{4.2.3}$$
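A short numerical sketch of Eq. 4.2.3 shows why a diagonal Q′ still yields a full process noise matrix at the image plane. The random Γ and the value of `sigma_u` are placeholders chosen for illustration. The D/A step is computed from the 14 bit, 250 Volt figures quoted above.

```python
import numpy as np

# D/A quantization step from the figures quoted in the text.
dac_step_volts = 250.0 / 2 ** 14              # about 0.015 V

# Hypothetical control effect matrix: 3 pixels (6 rows), 8 actuators.
rng = np.random.default_rng(1)
Gamma = rng.normal(size=(6, 8))
sigma_u = 0.1                                 # made-up actuation standard deviation

Q_prime = sigma_u ** 2 * np.eye(8)            # uncorrelated actuators
Q = Gamma @ Q_prime @ Gamma.T                 # Eq. 4.2.3, image plane process noise

off_diagonal = Q - np.diag(np.diag(Q))        # nonzero: pixels share actuators
```

The off-diagonal terms appear because every actuator influences many pixels, so an error on one actuator is correlated across the dark hole even when the actuators themselves are treated as independent.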

Following Howell [35] the noise from both the incident light and dark noise in a CCD

detector follows a Poisson distribution. This will lead to an estimator that is not truly

optimal for short exposures, but the Poisson distribution will more closely approximate

a Gaussian distribution with non-zero mean for an adequately long exposure. We could

subtract a median dark frame, but differencing pairwise images allows us to construct a

linear observer in matrix form. The noise in each measurement will have zero-mean and will

become more Gaussian as the exposure time increases.
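The two claims in this paragraph can be checked with a small simulation: a Poisson count becomes more symmetric (closer to Gaussian) as its mean grows, and the difference of two exposures with the same mean has zero mean. The mean count levels and sample size below are arbitrary choices for illustration, not detector values.

```python
import numpy as np

def sample_skewness(a):
    """Third standardized moment; zero for a symmetric distribution."""
    d = a - a.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

rng = np.random.default_rng(2)
n = 200_000

short_exposure = rng.poisson(2.0, n)      # low mean count (short exposure)
long_exposure = rng.poisson(200.0, n)     # high mean count (long exposure)

skew_short = sample_skewness(short_exposure)   # theory: 1/sqrt(2)
skew_long = sample_skewness(long_exposure)     # theory: 1/sqrt(200)

# Pairwise difference of two frames with the same mean count.
diff = rng.poisson(200.0, n) - rng.poisson(200.0, n)
diff_mean = diff.mean()                   # close to zero
diff_var = diff.var()                     # close to 2 * 200
```

The variance of the difference is twice the variance of a single frame, which is the price paid for removing the non-zero mean by differencing rather than by subtracting a median dark frame.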

The statistics for the HCIL detector at relevant exposure times are shown in Fig. 4.2.

Here we simplify the noise statistics by assuming it is uncorrelated and constant from pixel

to pixel, making Rk a diagonal matrix of the mean pixel covariance in units of contrast.

Since we do not have the ability to measure this variance actively the assumption is that the

CCD and laser source have been thermally stabilized, which will keep the variance constant

over time, thus defining the sensor noise as

$$R = \frac{\sigma_{CCD}}{I_{00}} I_{N_{pairs} \times N_{pairs}}. \tag{4.2.4}$$

Figure 4.2: Average counts across the detector for a dark frame, shown as histograms of occurrence versus counts for (a) a 50 ms dark frame and (b) a 200 ms dark frame. At short exposure times this follows a Poisson distribution. At much longer exposure times it should be well approximated by a Gaussian distribution, but is still slightly Poisson at 200 ms.

Having appealed to physical scaling in the HCIL, we now have close approximations of

the true process and sensor noise exhibited in the experiment. With an appropriate, non-

zero, initialization of the covariance we will be able to produce an effective optimal gain that

will leverage the data at each iteration as much as possible to produce a new state estimate

update.

4.3 Iterative Kalman Filter

An additional advantage of the Kalman filter is that we may apply the filter iteratively,

feeding the newly computed state xk(+) and covariance update Pk(+) back into the filter

again, setting uk−1 to zero. For sufficiently small control this will help account for nonlinearity

in the actuation and better filter noise in the system, limited only by the accuracy of

the observation matrix, Hk. With no control update, the control signal will be set to zero

when we iterate the filter. Following a notation similar to Gelb et al. [21], the jth iteration

of feedback into the iterative Kalman filter at the kth control step is

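$$
\begin{aligned}
x_{j,k}(-) &= x_{j-1,k-1}(+) && (4.3.1)\\
P_{j,k}(-) &= \Phi_{j-1,k-1} P_{j-1,k-1}(+) \Phi_{j-1,k-1}^T + Q_{j-1,k-1} && (4.3.2)\\
K_{j,k} &= P_{j,k}(-) H_{j,k}^T \left[ H_{j,k} P_{j,k}(-) H_{j,k}^T + R_{j,k} \right]^{-1} && (4.3.3)\\
x_{j,k}(+) &= x_{j,k}(-) + K_{j,k} \left[ z_{j,k} - H_{j,k} x_{j,k}(-) \right] && (4.3.4)\\
P_{j,k}(+) &= \left[ P_{j,k}(-)^{-1} + H_{j,k}^T R_{j,k}^{-1} H_{j,k} \right]^{-1}. && (4.3.5)
\end{aligned}
$$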

The power of iterating the filter lies in what we are fundamentally trying to achieve. For

a successful control signal, we will have suppressed the field. This means that the magnitude

of the probe signal will be lower than the control perturbation. This guarantees that Hk

will better satisfy the linearity condition than Γu. As a result, if we iterate the filter on

itself during a given control step we can use the discrepancy between the image predicted by

Hkxk(+) and the measurements, zk, in Eq. 4.3.4 to filter out any error due to nonlinear terms

not accounted for in Γ. In this way, we can accommodate a small amount of nonlinearity in

the extrapolation of the state without having to resort to a nonlinear, or extended, Kalman

filter. This means that we don’t have to re-linearize about xk(+), as would be the case for

an iterative extended Kalman filter (IEKF). It also avoids having to concern ourselves with

any bias introduced into the estimate by a nonlinear filter.
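The feedback loop of Eqs. 4.3.1 to 4.3.5 can be sketched as repeated measurement updates with the control term set to zero. The sketch assumes Φ = I and that the same measurement z is reused at every feedback iteration within a control step, which is an assumption of this example. The initial state error stands in for unmodeled nonlinearity in Γu, and all numerical values are made up.

```python
import numpy as np

def measurement_update(x_minus, P_minus, z, H, R):
    """Gain, state update, and covariance update (Eqs. 4.3.3-4.3.5)."""
    S = H @ P_minus @ H.T + R
    K = P_minus @ H.T @ np.linalg.inv(S)
    x_plus = x_minus + K @ (z - H @ x_minus)
    P_plus = np.linalg.inv(np.linalg.inv(P_minus)
                           + H.T @ np.linalg.inv(R) @ H)
    return x_plus, P_plus

def iterate_filter(x, P, z, H, Q, R, n_iter):
    """Feed the updated state and covariance back in with zero control."""
    residuals = []
    for _ in range(n_iter):
        x_minus = x                  # Eq. 4.3.1: no control term
        P_minus = P + Q              # Eq. 4.3.2 with Phi = I
        x, P = measurement_update(x_minus, P_minus, z, H, R)
        residuals.append(np.linalg.norm(z - H @ x))
    return x, P, residuals

# Hypothetical toy problem: 2 pixels, 2 probe pairs per pixel.
block = np.array([[1.0, 0.5], [-0.5, 1.0]])
H = np.kron(np.eye(2), block)
x_true = np.array([1.0, -2.0, 0.5, 3.0])
z = H @ x_true
x0 = x_true + np.array([0.3, -0.2, 0.1, -0.4])   # error left by the extrapolation

x_iter, P_iter, residuals = iterate_filter(
    x0, 0.1 * np.eye(4), z, H, Q=0.01 * np.eye(4), R=0.05 * np.eye(4), n_iter=5)
```

Each pass shrinks the discrepancy between the predicted and measured images, so the extrapolation error is filtered out without re-linearizing the model.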

4.4 Optimal Probes: Using the Control Signal

In §3.4 we discussed the choice of probe shapes to create a well posed problem. In principle,

we have found shapes that adequately probe the field by perturbing the field as uniformly as

possible. However, nobody has ever looked deeply into the true merit of these functions or


how to choose the “best” shapes to probe the dark hole. In any dark hole there are discrete

aberrations that are much brighter than others, requiring that we apply more amplitude to

those spatial frequencies. Conversely the bright speckles raise the amplitude of the probe

shape, which is then too bright to take a good measurement of dimmer speckles. Excluding

this issue, we also cannot truly generate the analytical functions described in §3.4. Even the

DM with the highest actuator density available, the Boston Micromachines 4K-DM©, can

only approximate each function with 64 actuators. We account for the true shape in the

model but this shape does not truly probe each pixel in the dark hole with equal weight,

which was the primary advantage of the analytical function for a probe shape in the first

place. Fortunately, we can once again appeal to the mathematical model for estimation and

control to help determine an adequate probe shape. Once one estimate has been provided,

the control law determines a shape to suppress all the speckles in the dark hole. Since it

has suppressed the field, this control shape necessarily probes the aberrated field in the

dark hole. If we apply the conjugate of the control shape we will increase the energy of

the aberrated field. Thus, we can rely on the controller to compute shapes that optimally

probe the aberrated field and automatically choose amplitudes appropriate for the intensity

at each pixel in the dark hole.
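The idea of reusing the control shape as a probe can be illustrated with a simplified controller. The sketch below replaces stroke minimization with a plain minimum-norm least-squares solve, which is a stand-in and not the control law of Ch. 2. The matrix sizes and random values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pix, n_act = 4, 12                      # hypothetical sizes

# Made-up complex control effect matrix and current field estimate.
G = rng.normal(size=(n_pix, n_act)) + 1j * rng.normal(size=(n_pix, n_act))
E = rng.normal(size=n_pix) + 1j * rng.normal(size=n_pix)

# Real-valued stacks with real and imaginary parts interleaved per pixel.
Gamma = np.empty((2 * n_pix, n_act))
Gamma[0::2], Gamma[1::2] = G.real, G.imag
x = np.empty(2 * n_pix)
x[0::2], x[1::2] = E.real, E.imag

# Stand-in controller: minimum-norm u that cancels the estimated field.
u = np.linalg.lstsq(Gamma, -x, rcond=None)[0]

# Using +/- u as the probe pair gives this probe field at each pixel.
probe_field = G @ u
H_rows = np.column_stack([probe_field.real, probe_field.imag])
```

Because the control shape is built to cancel the estimated field, its probe amplitude at each pixel matches the estimated field amplitude there, which is the automatic amplitude weighting described above.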


§4.1 - Filter Construction:

• The linear form of the filter derived here requires the linearized models of the image

plane electric field developed in Ch. 1 and Ch. 2. The merit of this linearization over

an extended Kalman filter and our ability to accommodate any nonlinearities via filter

recursion is discussed in this section.

§4.2 - Sensor and Process Noise:


• All process noise is limited to uncertainty in the DM actuation, and does not account

for interactuator coupling.

• All sensor noise is limited to the dark current of the CCD. The noise has zero-mean,

and over long exposures the Poisson distribution describing this noise will closely

approximate white Gaussian noise.

Chapter 5

Laboratory Results

In Chapters 2, 3, and 4 we developed a number of estimation and control algorithms that

make up the focal plane wavefront correction algorithm designed to recover regions of high

contrast in finite areas of the image plane. In this chapter we present the results of

experiments testing these correction algorithms at the Princeton HCIL and their ability to produce

symmetric dark holes in both monochromatic and broadband light. Since the purpose is to

develop and test the performance of the controller itself, we do not apply any post-processing

techniques to remove so-called “incoherent” components of the electric field [33]. This is not

done because it is possible that model error will look like incoherent light and will be falsely

subtracted, adding uncertainty to our performance. More importantly, our purpose is to test

the performance of the correction algorithms, which is different than testing our ability to

detect a planet. The values reported in this thesis demonstrate a situational performance

during an observation, rather than relying on a post-processing technique to achieve a higher

contrast level.

5.1 Monochromatic Performance

To begin, we demonstrate the monochromatic performance of the Princeton HCIL using

the stroke minimization correction algorithm derived in §2.1. This allows us to compare


the performance of the DM Diversity and Kalman filter estimators in the simplest scenario

where we do not have to consider the effect of bandwidth on their performance. Overall, we

demonstrate our ultimate achievable contrast and the ability of the Kalman filter to more

efficiently suppress the field by requiring fewer estimation exposures.

5.1.1 DM Diversity Performance

As discussed in Ch. 3, the DM diversity estimator can produce a unique solution with as

few as two measurements, each a difference image of conjugate DM shapes. This is the

number of measurements used at the JPL HCIT for estimation at each iteration, but at the

Princeton HCIL there is enough uncertainty in the system that we require a minimum of

three measurements (6 images) to take advantage of the averaging effect of the left pseudo-

inverse. Practically, we find that the DM diversity estimator requires four measurements (8

images total) to reach our ultimate achievable contrast levels. This is the baseline image set

for comparing to the performance of the Kalman filter in §5.1.2. The laboratory starts at

an initial contrast of 1.23 × 10−4 (Fig. 5.1(a)). Using the least-squares estimation technique

it is capable of reaching an average contrast of 2.3 × 10−7 in a (7–10)×(-2–2) λ/D region

within 30 iterations (Fig. 5.1(c)) on both sides of the image plane, a unique capability that is

a result of the two deformable mirrors in the system. The size of the dark hole is limited by

our certainty in the DM shape. As we increase the outer working angle of the dark hole, we

need better certainty of the DM at higher spatial frequencies to maintain the same contrast

level. This is compounded by the fact that we have two DMs in series, but this enables us

to create symmetric dark holes in the image plane.

Figure 5.1: Experimental results of sequential DM correction using the DM-Diversity estimation algorithm. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).

5.1.2 Kalman Filter Performance

In this section we correct the field using the Kalman filter from Ch. 4 for estimation using

four, three, two, and one pair of images as a measurement update to assess the degradation

in performance as information is lost. We begin with four measurements (four image pairs),

to compare its performance using the same number of measurements as the batch process

estimator to produce the results of §5.1.1. Using 4 pairs, the filter achieved a contrast of

4.0 × 10−7 in (7–10)×(-2–2) λ/D symmetric dark holes within 20 iterations of the controller,

shown in Fig. 5.2. Note that this used a total of 160 estimation images, which is the same

amount of information available to the batch process method in §5.1.1 when it achieved a

contrast of 3.5× 10−7 in 20 iterations.

Figure 5.2: Experimental results of sequential DM correction using the discrete time extended Kalman filter with 4 image pairs to build the image plane measurement, zk. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).

When the number of image pairs was reduced to three, the correction algorithm was still

able to reach a contrast level of 5.0 × 10−7 using only 120 estimation images, as shown in

Fig. 5.3. Having proven that we can successfully reach very close to the same limits with fewer

exposures, we now tune the covariance initialization and noise matrices and attempt using

only two pairs of images. By reducing the number of image pairs to two, we are using half

as many images as the correction shown in §5.1.1 and have reached a point where the batch

process method will no longer provide a solution that takes advantage of the averaging effect

of the left pseudo-inverse solution. After further tuning the covariance and noise matrices,

the contrast achieved after 30 iterations of the correction algorithm was 2.3 × 10−7, shown

in Fig. 5.4. Note that this is better than the case which used three pairs because we have

improved the covariance initialization and increased the number of times the filter is iterated

in a single control step. In fact it should be noted that making the filter iterative is critical

to its performance since it accounts for nonlinearity, particularly in the propagation of the

control.

[Figure 5.3 panels: (a) Initial Image; (b) Contrast Plot: 3 Pairs; (c) Final Image: 3 Pairs. Image axes are in λ0/D; the plot shows average, left, and right contrast versus iteration.]

[Figure 5.4 panels: (a) Aberrated Field; (b) Contrast Plot; (c) Corrected Contrast: 2.3 × 10−7. Image axes are in λ0/D; the plot shows average, left, and right contrast versus iteration.]

Reducing the number of measurements to a single pair we find a very interesting result.

The quality of the measurement at any particular time step of the algorithm is now dependent

on the quality of that particular probe shape. As a result, if the probe does not happen to

modulate the field well the field estimate gets worse. It is also important to cycle through

the probe shapes. A single probe may not modulate a specific location of the field well, so we

must choose a different probe shape to guarantee that we adequately cover the entire dark

hole. Starting from an aberrated field with an average contrast of 9.418× 10−5, Fig. 5.5(a),

we achieved a contrast of 3.1×10−7 in 30 iterations and 2.5×10−7 in 43 iterations of control,

Fig. 5.5(c). Looking at the contrast plot in Fig. 5.5(b), the sensitivity of a single measurement

update to the quality of the probe is very clear. What is interesting, however, is that the

modulation damps out over the control history. While we do not suppress as quickly in earlier

iterations, as in the case with more probes, we achieve our ultimate contrast levels in almost

the exact same number of iterations. This is a direct result of developing good coverage

across the dark hole over time by changing the probe shape at each iteration. Thus, even

with one measurement update at each iteration the prior state estimate history stabilizes the

estimate in the presence of the measurement update’s poor signal-to-noise at high contrast

levels. What is further encouraging is that this is still applying the arbitrary probe shapes

derived in Ch. 3. If we were to intelligently choose our probes, as discussed in Ch. 4, we would

expect a dramatic improvement in the rate of convergence for a single measurement update.

Figure 5.5: Experimental results of sequential DM correction using the discrete time extended Kalman filter with one image pair to build the image plane measurement, zk. The dark hole is a square opening from 7–10 × -2–2 λ/D on both sides of the image plane. (a) The aberrated image. (b) Contrast plot. (c) The corrected image. Image units are log(contrast).

A very promising aspect of this estimation scheme is that its performance did not degrade

significantly as the amount of measurement data was reduced. With only 86 estimation

images it was capable of reaching the same final contrast (within measurement uncertainty)

achieved by the DM diversity algorithm in §5.1.1, which achieved a contrast of 2.5 × 10−7

in 30 iterations. The batch process required 240 images to maintain an estimate of the

entire control history, achieving a contrast of 2.3 × 10−7. Thus by making the estimation

method more dependent on a model we were able to reduce our need to measure deterministic

perturbations in the image plane electric field.
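The image counts quoted in this section follow from two exposures per probe pair at every control iteration. The short check below reproduces them; the function name `estimation_images` is introduced here for illustration only.

```python
def estimation_images(n_pairs, n_iterations):
    """Each probe pair is two exposures (positive and negative DM shape)."""
    return 2 * n_pairs * n_iterations

counts = {
    "kalman_4_pairs_20_iter": estimation_images(4, 20),
    "kalman_3_pairs_20_iter": estimation_images(3, 20),
    "kalman_1_pair_43_iter": estimation_images(1, 43),
    "batch_4_pairs_30_iter": estimation_images(4, 30),
}

# Fraction of the batch process exposure budget used by the one-pair filter.
one_pair_fraction = counts["kalman_1_pair_43_iter"] / counts["batch_4_pairs_30_iter"]
```

The one-pair filter run therefore used roughly a third of the estimation exposures of the four-pair batch process run.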

5.2 Broadband Performance

To take spectra of any planets we discover in our dark hole, we need to extend the

experimental results of §5.1 to broadband light so that the planet is detectable in more than one

wavelength. In §2.4 we developed the Windowed Stroke Minimization algorithm with

estimate extrapolation to accomplish this task. Here we show the results of these experiments,

and point to an interesting laboratory limitation that required an upgrade of the optical fiber

used as the point source in the experiment. The results in §5.2.1 are prior to this upgrade,

and are included for the sake of comparison. §5.2.2 shows the most current results from the


Princeton HCIL producing symmetric dark holes in a targeted 10% band around λ0 = 633

nm. In all cases, the results we present are in a dark hole region with dimension 7 –10 ×

-2 – 2 λ0/D. The contrast measurement is pinned to the central wavelength so that we can

pin the performance to a fixed sky angle, α, defined as α = tan−1(nλ0/D). In a 10% band

the physical shift is less than one pixel at the HCIL, and the controller corrects an area of 6

–11 × -3 – 3 λ0/D. In this way we do not have areas that the controller has not corrected

leaking light into the dark hole, skewing our measurement of the controller performance.
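As a rough check on the padding described above, the following sketch assumes that a feature fixed at n λ/D appears at n·(λ/λ0) when measured in units of λ0/D, and tests whether the scoring region, scaled to the edges of a 10% band, stays inside the 6–11 × -3–3 λ0/D control region. The function names and the ±5% band edges are illustrative assumptions and are not taken from the HCIL software.

```python
def scaled_region(x_range, y_range, wavelength_ratio):
    """Scale a region given in lambda0/D by lambda/lambda0.

    Assumes pure chromatic magnification of the image plane.
    """
    return (tuple(v * wavelength_ratio for v in x_range),
            tuple(v * wavelength_ratio for v in y_range))


def contains(outer, inner):
    """True if the rectangle `inner` lies entirely inside `outer`."""
    (ox, oy), (ix, iy) = outer, inner
    return (ox[0] <= ix[0] and ix[1] <= ox[1]
            and oy[0] <= iy[0] and iy[1] <= oy[1])


# Regions quoted in the text, in units of lambda0/D.
score_region = ((7.0, 10.0), (-2.0, 2.0))
control_region = ((6.0, 11.0), (-3.0, 3.0))

# Edges of a 10% band around the central wavelength (assumed +/- 5%).
band_edges = (0.95, 1.05)

covered = all(contains(control_region, scaled_region(*score_region, r))
              for r in band_edges)
print("scoring region stays inside control region:", covered)
```

Under these assumptions the scaled scoring region spans at most 6.65 to 10.5 λ0/D horizontally and ±2.1 λ0/D vertically, so it remains within the corrected area at both band edges.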

5.2.1 Prior to Single Mode Photonic Crystal Fiber

In the first broadband experiments at the HCIL, the output fiber shown in Fig. 1.5 was

simply a 633 nm single mode fiber. The correction was performed at 620 rather than 633

nm as in the other experiments, partially because of filter availability. In this experiment

we have tested the performance of the Windowed Stroke Minimization algorithm of §2.4

over a 10% bandwidth. The estimate for the filters bounding the 10% target bandwidth

are computed using the estimate extrapolation technique developed in §2.5. Starting at

an average contrast of 1.2740 × 10−4 (Fig. 5.6(c)) over the five filters spanning our 10%

bandwidth (600,620,633,640,650 nm), Fig. 5.6(d) shows an average contrast of 6.15 × 10−6

when we use the filter extrapolation technique. Note that while the central wavelength of the

650 nm filter does not exactly reach 5% above our central wavelength, it has a relatively wide

bandwidth that reaches 558 nm at its FWHM. Starting at a contrast level of 1.0529 × 10−4

over the full bandwidth, the controller reached a contrast limit of 1.842 × 10−5.

Looking at the wavelength performance, we see that even the central wavelength is not

suppressed particularly well. The dark holes exhibit a good average contrast, but there is a

lot of variance within them and their edges are not well defined. Additionally, we see a rapid

degradation of the dark hole field as a function of wavelength to the point where it is virtually

indistinguishable when we reach the bounding wavelengths in the optimization at 600 and

650. Compared to a typical monochromatic experiment, these images depict an abnormally high amount of structure in the dark hole and appear to be highly sensitive to variance in low to mid-spatial frequency aberrations. The chromatic dependence of these errors, particularly at the shorter wavelengths, indicates that the 633 nm single mode fiber is inadequate for the broadband experiments. This is a result of the multimode output (primarily TEM01 and TEM10 modes) at shorter wavelengths and the high degree of attenuation at longer wavelengths. We chose to reproduce the results in this section after upgrading the fiber in the laboratory.

[Figure 5.6 panels: (a) "Pre-PCSM Extrapolation Results", contrast (10−5 to 10−4) vs. λ (550–700 nm), curves: AVG Contrast, Left Contrast, Right Contrast, Initial Contrast. (b) "Pre-PCSM Full Bandwidth 1.84 × 10−5". (c) "Pre-PCSM 10% Mean Initial = 1.27 × 10−4". (d) "Pre-PCSM 10% Mean Contrast = 6.15 × 10−6". Image axes in λ0/D from −10 to 10; color scale log10(Contrast).]

Figure 5.6: Pre-PCSM Extrapolated results

5.2.2 Photonic Crystal Single Mode Fiber Upgrade

Given the non-single mode nature of the output beam at shorter wavelengths (and our

sensitivity to such aberrations), the poor coupling efficiency, and high attenuation of the

633 nm single mode fiber at longer wavelengths we chose to upgrade the fiber delivery to a

Koheras Photonic Crystal continuously Single Mode (PCSM) fiber. We chose a fiber option

that fully spanned the bandwidth we operate over, with a 5 µm core. This provided the

smallest mode field diameter available, 4.5 – 4.7 ± 0.5 µm, providing a numerical aperture

(NA) of ≈0.1 – 0.14 across the visible spectrum (NA being the sine of the divergence half-

angle). This is comparable to the 4.3 – 4.6 µm mode field diameter and NA 0.10 – 0.14 of

a single mode 633 nm fiber between 633 and 680 nm. Overall, the PCSM fiber has a lower

level of attenuation, is continuously single mode, and roughly matches the beam divergence

angle expected from the original fiber, which we have found in the past to well approximate

a point source. Since the field from a star is effectively planar, our ability to provide single-

mode light at all wavelengths allows us to more accurately demonstrate the controller under

conditions true to a real observation. Fig. 5.8 shows the overall results of applying the same

extrapolation technique after the new fiber had been installed. Fig. 5.8(a) shows marked

contrast improvement at all wavelengths, the out of band wavelengths improving on the order

of 3 × 10−5. As we would have expected, the shorter wavelengths improved more than the

longer wavelengths because their output no longer contains higher TEM modes. Comparing

Fig. 5.8(b) to Fig. 5.6(b) we also see that we have a slight improvement in the inner working

angle of the dark hole, which is consistent with the fact that we eliminated very low order

modes, such as TEM01 and TEM10, by upgrading to the new fiber. Very little of the energy

in Fig. 5.6(d) is (intentionally) below the cutoff wavelength for single mode output of the

633 nm SM fiber, which is why the IWA improvement is not as evident when comparing to

Fig. 5.8(d). Looking at the progression of the final dark hole in wavelength, Fig. 5.9, we see

that the central wavelength is deeply suppressed while the intensity of the dark hole rises

rapidly. For the filters inside the ≈ 10% optimization bandwidth (600, 620, 640, and 650

nm) we see that the contrast degradation is a result of small scale aberrations growing in

intensity. Outside of these wavelengths the dark hole degrades rapidly to the point that it is

not distinguishable in the 550 and 740 nm images. While the average contrast does degrade

from the slight shift in the dark hole location, it is also due to speckles within the dark hole

increasing in intensity. This indicates that we are somewhat limited by the accuracy of our

extrapolation, which tends to introduce fine structure into the dark hole. Note, however,

that when comparing Fig. 5.9(e) with Fig. 5.7(d) we see that the fine structure at the central

wavelength is gone. This is entirely due to the fiber upgrade, since no other modifications

were made to the experiment.


The accuracy of the functional relationship of the phase and amplitude among wave-

lengths will ultimately bound the achievable bandwidth; therefore, as a metric, these results

are also compared to estimating each wavelength separately. As discussed in §2.5, improv-

ing this functional relationship requires that we establish a higher order relationship of the

electric field that captures more of the system model. For the time being, we compare the

performance of the simplest (and fastest) extrapolation technique we may physically mo-

tivate to multiple estimates, which will be slower but presumably more accurate at longer

wavelengths.

Fig. 5.10 shows the overall performance of multiple estimates vs. single estimates. When

estimating each wavelength separately the contrast reaches 5.67 × 10−6 in a ∼ 10% band

(Fig. 5.10(d)) and 1.364× 10−5 over the full bandwidth (Fig. 5.10(b)). There is no improve-

ment compared to the 5.48×10−6 contrast achieved in the 10% band and 1.298×10−5 contrast

over the full spectrum using the estimate extrapolation technique. Shaklan et al. [65] show

that the ultimate achievable contrast is a function of the correction bandwidth. They show

that this limitation is from propagation induced amplitude distributions in the field from

surface figure errors on the optics, and the fact that we have a finite controllable bandwidth

using two DMs in series (or a Michelson configuration). If we assume that the DM surfaces

(Fig. 1.10) are the worst figures in our system and apply this to the derivation in Shaklan

et al. [65], the HCIL optical system should be capable of reaching at least 1 × 10−6 over a

20% bandwidth, indicating that both methods are well above the fundamental limitations

of this optical system (Figs. 5.1(b), 5.10(a)) and these results are largely limited by higher

sensitivity to estimation error and system stability. Comparing the contrast as a function of

wavelength in Fig. 5.8(a) and Fig. 5.10(a), the bandwidth has been suppressed much more

uniformly when multiple estimates are used in lieu of the extrapolation technique. Since the

bounding wavelengths were only slightly underweighted in the optimization (µ = 0.75) we

expected a relatively uniform suppression as in Fig. 5.10(a). This indicates that the accuracy

of the extrapolation was the limiting factor in allowing the controller to evenly suppress the


bandwidth. However, the ultimate contrast of the central wavelength is not nearly as low in

the direct estimate as it was when applying the extrapolation technique. Compared to the dark hole at

the central wavelength obtained using estimate extrapolation (Fig. 5.9(e)), we see that the dark hole

using direct measurements (Fig. 5.11(e)) exhibits much more residual structure. However,

Fig. 5.11(c)–Fig. 5.11(g) show that the region bounding the corrected area persists better

than the dark hole in the extrapolation case, Figs. 5.9(c)–5.9(g).
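The 10% band figure quoted for the direct estimate can be cross-checked against the per-filter contrast values printed in the panel titles of Fig. 5.11. The sketch below assumes the band mean is an unweighted average over the five filters in the optimization bandwidth (600, 620, 633, 640, and 650 nm); the variable names are illustrative.

```python
# Per-filter mean contrast from the Fig. 5.11 panel titles (direct estimates).
direct_contrast = {
    600: 6.3108e-06,
    620: 4.8845e-06,
    633: 4.0046e-06,
    640: 6.5068e-06,
    650: 6.6701e-06,
}

# Assumption: the 10% band contrast is an unweighted mean over the filters.
band_mean = sum(direct_contrast.values()) / len(direct_contrast)

# Ratio of the worst in-band filter to the central wavelength, as a simple
# measure of how uniformly the band was suppressed.
uniformity = max(direct_contrast.values()) / direct_contrast[633]

print(f"10% band mean contrast: {band_mean:.3e}")
print(f"worst in-band filter / central wavelength: {uniformity:.2f}")
```

The unweighted mean comes out within rounding of the 5.67 × 10−6 reported in Fig. 5.10(d), and the worst in-band filter is less than a factor of two above the central wavelength.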

Since both reached roughly the same average contrast in the 10% band, we may have

fundamentally bottomed out the achievable contrast for symmetric dark holes (at the HCIL)

over that bandwidth. In other words, we can either have all five filters at a modest con-

trast level or we can have one wavelength highly suppressed at the cost of worse contrast

in the others. This would be related to the inherent chromaticity of our pupil from highly

aberrated, non-conjugate planes. This could be beyond the effective bandwidth achievable

using only two DMs in series. However, another distinct possibility is that we have reached

a stability limit in the experimental setup. Since we required three individual estimates to

achieve the results shown in Fig. 5.10, the estimation step took roughly three times longer

than in the extrapolation case. The low power of the filtered broadband light requires exposure

times of ≈40 seconds. With 8 exposures required per estimate using the batch process

estimator, the estimation step went from ≈5 to ≈15 minutes per iteration. As

will be shown in §6.5, the system is only stable to ≈2 − 5 × 10−7 over such a long period

(independent of power fluctuations). Thus, the extrapolation method reached the limit of

system variance over a 5 minute interval at the central wavelength, but at the cost of less

accurate estimates over the bandwidth due to an inaccurate extrapolation. On the other

hand, the longer time frame required to take multiple estimates meant that we compromised

the stability of the experiment but we were able to more evenly suppress the field over the

bandwidth. As a result, we cannot prove that we have reached a fundamental limit in the

laboratory without getting more laser power or improving system stability. The sensitivity of

the correction algorithm to laboratory stability demonstrates the power of the extrapolation


technique. To take full advantage of an observatory’s stability, we clearly want to reduce

the time required to produce estimates of the electric field over the optimization bandwidth.

Furthermore, the advantage of establishing an augmented cost function and using extrapo-

lated wavelengths is that it automatically extends the optimal estimator developed in Ch. 4

to broadband light because this method only requires a single monochromatic estimate. It

is therefore worthwhile to continue pursuing more accurate and sophisticated extrapolation

techniques. The most promising direction we currently see is to augment the Kalman filter

to include the extrapolation. This potentially allows us to produce estimates of multiple

wavelengths using incomplete measurements at every wavelength. Thus, the estimator would

not produce an independent estimate at each wavelength, but would instead average uncertainty in the

wavelength dependence across all three estimates.
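The timing argument in this paragraph can be reproduced with simple arithmetic. The sketch below uses the quoted values of 8 exposures per estimate and ≈40 second exposures, and assumes that readout and DM settling overhead are negligible; the function name is illustrative.

```python
def estimation_minutes(n_estimates, exposures_per_estimate=8, exposure_s=40.0):
    """Time spent exposing for the estimation step of one iteration, in minutes.

    Assumes readout and actuation overhead are negligible next to the
    exposure time.
    """
    return n_estimates * exposures_per_estimate * exposure_s / 60.0


single = estimation_minutes(1)   # estimate extrapolation: one monochromatic estimate
triple = estimation_minutes(3)   # direct approach: three separate estimates

print(f"extrapolation: {single:.1f} min, direct: {triple:.1f} min per iteration")
```

This gives roughly 5 minutes per iteration for the extrapolation case and 16 minutes for three separate estimates, in line with the ≈5 and ≈15 minute figures quoted above.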

5.3 Final Remarks

In this chapter we have demonstrated the Kalman filter estimator, the windowed stroke

minimization algorithm, an estimate extrapolation technique, and the ability to create sym-

metric dark holes via two DMs in series. The Kalman filter was tested against the original

DM-Diversity batch process method using the monochromatic Stroke Minimization algo-

rithm developed by Pueyo et al. [55]. The experiments show that the Kalman filter’s ability

to optimally estimate the field, balancing prior state estimate feedback with new measurements,

dramatically boosts the efficiency of the estimation stage and stabilizes the estimate

at higher contrast levels. This efficiency is a function of both the number of exposures and

the exposure time. As it stands, the HCIL suppresses three orders of magnitude, which is

well within the dynamic range of our camera. As a result, exposure time does not affect the

efficiency of the results shown in this section. However, as we continue to higher contrast

levels (∼ 10^4 for a 16 bit camera) the exposure time at each iteration will also contribute

to the efficiency since we will have to change the exposure time. In this case state estimate

feedback will become even more important because the reliance on prior estimates will

reduce our dependence on extremely long exposure times at very high contrast levels. Overall,

the results presented here show that the number of exposures required for estimation has been halved

without sacrificing achievable contrast or dark hole area. Additionally, thanks largely to

model improvements the convergence rate to the ultimate achievable contrast has increased

dramatically. The monochromatic wavefront suppression has matured greatly.

The Windowed Stroke Minimization algorithm is the first controller written so that it

explicitly suppresses a bandwidth, and the initial results are promising. The extrapolation

technique allows us to further improve the efficiency of the estimation stage by removing

the requirement that we obtain a field estimate for every wavelength in the optimization.

The ultimate achievable contrast in a 10% band is a little more than one order of magnitude

worse than the best performance demonstrated with the monochromatic algorithm, but this

is within a factor of two of the original symmetric dark hole results using monochromatic

light presented in Pueyo et al. [55]. There is still a great deal of work to be done to directly

optimize the bandwidth and improve the estimate extrapolation, but given the simplicity of

the physically motivated computations the results shown in §5.2.2 are very promising.

[Figure 5.7 panels: dark hole images for individual filters, pre-PCSM, estimate extrapolation. (a) λ = 550 nm (panel title "Pre-PCSM Extrapolate, λ = 550 nm, 2.4183e-05"), (b) λ = 577 nm, (c) λ = 600 nm, (d) λ0 = 620 nm, (e) λ = 633 nm, (f) λ = 640 nm, (g) λ = 650 nm, (h) λ = 670 nm, (i) λ = 694 nm, (j) λ = 740 nm. Image axes in λ0/D from −10 to 10; color scale log10(Contrast) from −6 to −4.]

Figure 5.7: Pre-PCSM Extrapolate Individual Filters

[Figure 5.8 panels: (a) "δ1 = δ2 = 0.75 Contrast Plot", contrast (10−6 to 10−4) vs. λ (550–700 nm), curves: AVG Contrast, Left Contrast, Right Contrast, Initial Contrast. (b) "Extrapolated Full Bandwidth 1.298 × 10−5". (c) "Extrapolated 10% Mean Initial = 1.02 × 10−4". (d) "Extrapolated 10% Mean Contrast = 5.48 × 10−6". Image axes in λ0/D from −10 to 10; color scale log10(Contrast).]

Figure 5.8: Extrapolated results

[Figure 5.9 panels: dark hole images for individual filters, estimate extrapolation. (a) λ = 550 nm (panel title "Extrapolated, λ = 550 nm, 1.6352e-05"), (b) λ = 577 nm, (c) λ = 600 nm, (d) λ = 620 nm, (e) λ0 = 633 nm, (f) λ = 640 nm, (g) λ = 650 nm, (h) λ = 670 nm, (i) λ = 694 nm, (j) λ = 720 nm, (k) λ = 740 nm. Image axes in λ0/D from −10 to 10; color scale log10(Contrast) from −6 to −4.]

Figure 5.9: Extrapolated Estimate Individual Filters

[Figure 5.10 panels: (a) "Contrast Plot, Multiple Estimates 10% Band", contrast (10−5 to 10−4) vs. λ (550–700 nm), curves: AVG Contrast, Left Contrast, Right Contrast, Initial Contrast. (b) "Direct Full Bandwidth 1.364 × 10−5". (c) "Direct 10% Mean Initial = 9.83 × 10−5". (d) "Direct 10% Mean Contrast = 5.67 × 10−6". Image axes in λ0/D from −10 to 10; color scale log10(Contrast).]

Figure 5.10: Direct Estimate results

[Figure 5.11 panels: dark hole images for individual filters, direct estimates, with the contrast given in each panel title. (a) λ = 550 nm, 1.9162e-05. (b) λ = 577 nm, 1.3395e-05. (c) λ = 600 nm, 6.3108e-06. (d) λ = 620 nm, 4.8845e-06. (e) λ0 = 633 nm, 4.0046e-06. (f) λ = 640 nm, 6.5068e-06. (g) λ = 650 nm, 6.6701e-06. (h) λ = 670 nm, 1.4555e-05. (i) λ = 694 nm, 1.777e-05. (j) λ = 720 nm, 2.4958e-05. (k) λ = 740 nm, 4.0198e-05. Image axes in λ0/D from −10 to 10; color scale log10(Contrast) from −6 to −4.]

Figure 5.11: Direct Estimate Individual Filters

Chapter 6

Sources and Propagation of Error

To fully understand the performance of the estimation and control algorithms demonstrated

in Ch. 5, we must identify the sources of error and determine how they affect the contrast

performance. Since this becomes an issue of accuracy and precision, it affects both the

ultimate achievable contrast and the rate of convergence. For example, in §5.1.2 the average

contrast in the dark hole for the first three iterations was 1.3934× 10−4, 5.303× 10−5, and

1.827 × 10−5. Within 30 iterations we reached a contrast floor of 2.3 × 10−7. We need to

understand what limited our ultimate achievable contrast, but we also need to understand

what limited our convergence rate. We did not have to re-linearize the controller to achieve

this contrast level, so presumably we should have been able to control to a greater level of

precision in the initial iterations to achieve our ultimate contrast in fewer iterations. We

need to understand what limited the accuracy and precision of our estimate at 1.3934×10−4

so that we only reached a contrast level of 5.303 × 10−5 in the first control step, rather

than being able to get close to our ultimate achievable contrast in a single iteration of the

correction algorithm. In this chapter we will investigate experimental limitations that affect

our ultimate achievable contrast, as well as those that will limit our performance in a single

iteration. These limitations can either be model dependent or a limitation of the actual

experiment. To maximize the efficiency of the controller, we would eventually like to reach


a point where each iteration is limited only by the precision of the experiment.

6.1 Precision of a Contrast Measurement

Throughout this thesis we have measured the intensity distribution in the image plane using

dimensionless units that we commonly refer to as contrast because the values are normalized to

the peak value of the star’s PSF. In this section we will develop how we calibrate the image

by normalizing the peak of the PSF, and discuss the implications of such a normalization on

the estimator when the intensity is discretized by the detector. Knowing the detector bias,

b, and the peak counts, qp, of the PSF’s main lobe, the contrast of any pixel with q counts in a

single image is given by

\[ 10^{-C} = \frac{q - b}{q_p - b}. \tag{6.1.1} \]

The value of q and qp assume a specific exposure time, t, and pixel dimensions, (dx, dy). By

finding the peak count rate per unit time per unit area,

\[ I_{00} = \frac{q_p - b}{t \, dx \, dy}, \quad [\text{counts/sec/m}^2/\text{contrast}] \tag{6.1.2} \]

we can evaluate the normalization constant for arbitrary exposure time and pixel binning.

Thus, the contrast of any pixel in the image with q counts for an image with arbitrary

exposure time and effective pixel dimension, (dx, dy), is

\[ 10^{-C} = \frac{q - b}{t \, dx \, dy \, I_{00}}. \tag{6.1.3} \]

Assuming we do not wish to change our spatial sampling of the image plane, (dx, dy) is fixed

and we can simply define our normalization as

I0 = I00 dx dy. (6.1.4)
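The calibration in Eqs. 6.1.1 through 6.1.4 can be sketched as two small functions. This is a minimal illustration for a fixed pixel size, not the HCIL reduction code; the function names and the example counts and exposure times are hypothetical (only the ≈3,380 count bias is taken from later in this section).

```python
def peak_rate(q_peak, bias, t_exp):
    """Normalization I0 (Eq. 6.1.4): bias-subtracted peak counts per unit time.

    The pixel area is folded into I0, i.e. I0 = I00 * dx * dy.
    """
    return (q_peak - bias) / t_exp


def contrast(q, bias, t_exp, i0):
    """Contrast of a pixel with q counts (Eq. 6.1.3 with I0 = I00 dx dy)."""
    return (q - bias) / (t_exp * i0)


# Hypothetical calibration frame: unsaturated PSF peak in a short exposure.
bias = 3380.0
i0 = peak_rate(q_peak=50000.0, bias=bias, t_exp=0.001)

# The peak itself must come out at a contrast of 1 (Eq. 6.1.1).
peak_contrast = contrast(50000.0, bias, 0.001, i0)

# Hypothetical dark hole pixel measured in a much longer exposure.
dark_contrast = contrast(3846.0, bias, 1.0, i0)

print(f"peak: {peak_contrast:.3f}, dark hole pixel: {dark_contrast:.2e}")
```

Because the normalization is a count rate, the same I0 applies to images taken with any exposure time, which is what allows a short calibration exposure of the PSF peak to be used with long science exposures.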


Knowing how we evaluate the contrast in a single image, we want to understand the

precision with which we can measure the contrast at any point in the image plane. We

quantify this as the contrast value per count on the detector, given by

\[ \frac{10^{-C}}{q - b} = \frac{1}{t I_0}. \quad [\text{contrast/count}] \tag{6.1.5} \]

We define this as the contrast resolution of the image. Looking at Eq. 6.1.5 we see that

the contrast value is a function of integration time and the normalization constant. For a

fixed input power from the light source, I0 will not change. Thus, for a fixed pixel area our

contrast resolution will be limited by the detector bias, the number of bits in the ADC, and

our exposure time. The bias and number of bits are ultimately fixed quantities for a given

detector, so integration time is our only variable in determining the contrast resolution. From

Eq. 6.1.5, we see that by maximizing our exposure time we can evaluate the best possible

contrast resolution,

\[ \frac{10^{-C_{\min}}}{q} = \frac{1}{t_{\max} I_0}. \tag{6.1.6} \]
To evaluate our best contrast resolution, we must first calculate the maximum allowable

exposure time. This is not as obvious as it would first appear. Our first requirement is that

we do not saturate the detector. We choose an exposure time that gets the pixel with the

worst contrast, 10−Cmax , as close to the maximum allowable counts, (qmax − b), as possible

without overshooting. However, we would also like the pixel with the highest contrast in our

estimation area, 10−Cestmax , to stay within the linear response of the detector. This imposes

a ceiling of (qestmax − b) counts on this pixel, which is lower than (qmax − b). Thus, our

maximum allowable exposure time will be

\[ t_{\max} = \min\left[ \frac{q_{\max} - b}{I_0 \, 10^{-C_{\max}}}, \; \frac{q_{\mathrm{estmax}} - b}{I_0 \, 10^{-C_{\mathrm{estmax}}}} \right]. \tag{6.1.7} \]

To find the contrast resolution in a single image, we consider the contrast measured per


count. Having found tmax, this resolution is defined as

\[ \frac{10^{-C_{\min}}}{q} = \frac{1}{t_{\max} I_0}. \tag{6.1.8} \]
Applying Eq. 6.1.7 to Eq. 6.1.8 we find that the contrast resolution for a single exposure is

\[ \frac{10^{-C_{\min}}}{q} = \max\left[ \frac{10^{-C_{\max}}}{q_{\max} - b}, \; \frac{10^{-C_{\mathrm{estmax}}}}{q_{\mathrm{estmax}} - b} \right]. \tag{6.1.9} \]

To decide which contrast value is the limiting factor, we must look to the control history.

In the early stages of wavefront control 10−Cestmax will be larger than at later iterations, and

may compete as the limiting factor on exposure time simply because the controller has not

suppressed it yet. In a classical correction scheme, where we allow the field to be suppressed

at each iteration, this value will continue to decrease and will not be the limiting factor on

tmax. Likewise, it is at these higher contrast levels that we become more concerned with the

precision of our estimate since it will become a more significant fraction of the measured

intensity. Thus, in this correction scenario we are more concerned with the peak contrast

value, 10−Cmax , and how it limits the exposure time. As an example, the 16-bit camera used in

the Princeton HCIL has 65,536 counts with a bias of approximately 3,380 counts. In a typical

experiment at the Princeton HCIL, the peak contrast level in the aberrated field is limited to

approximately 1×10−3 and I00 is approximately 107000 counts/sec/m2/contrast with 2 mW

of laser power. When we probe the field we can conservatively expect the perturbation to

double this value, particularly in earlier iterations where the probes have greater amplitude.

To stay well away from saturation even in the presence of noise fluctuations, this peak

contrast level should not exceed 60,000 counts. As a result, the maximum allowable exposure

time is ≈ 280 ms, making the best possible contrast resolution ≈ 3×10−8. Fortunately higher

levels of precision are possible simply by frame adding. Neglecting read-noise and detector

dark current, the maximum allowable counts can simply be multiplied by the number of

exposures. The lowest achievable detection limit is then limited by read noise. This has


been considered in great detail by Savransky [62], and this work even points towards a

requirement that we use a photon counting detector in a true planet finding mission. The

best contrast value measured in symmetric dark holes at the HCIL is 2.3 × 10−7. Even at

these contrast levels the precision of our measurement has become a significant percentage

of the measured value. At the Princeton HCIL we actively purge warm nitrogen over a

thermoelectrically cooled CCD so that we do not require a window on the detector. Since

we have applied a convective heating source to the detector it reaches thermal equilibrium at

a higher temperature. The constant flux problem also means that the thermoelectric coolers

cannot stabilize the temperature as well. The net result is that the residual detector noise

will vary from image to image by approximately 10 counts (see Fig. 4.2). Thus, both the

precision of a single image and the variance in the detector background for a single image

are very near our minimum achievable contrast and we find that we must average at least

two frames to maintain this contrast level.
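The numbers in this example can be reproduced from Eqs. 6.1.7 and 6.1.8. The sketch below treats the quoted rate of 107000 as the per-pixel normalization I0 and leaves the time unit symbolic, since the text quotes the rate per second but the exposure in milliseconds; both choices are assumptions made here. It evaluates the exposure limit with the 60,000 count ceiling interpreted both as raw counts and as counts above the bias.

```python
def t_max(q_ceiling, bias, i0, peak_contrast):
    """Longest exposure keeping the brightest pixel below q_ceiling (Eq. 6.1.7, first term)."""
    return (q_ceiling - bias) / (i0 * peak_contrast)


def contrast_per_count(t_exp, i0):
    """Contrast resolution of a single exposure (Eq. 6.1.8)."""
    return 1.0 / (t_exp * i0)


I0 = 107000.0              # quoted peak rate, counts per unit time per unit contrast
PEAK_CONTRAST = 2.0e-3     # 1e-3 aberrated peak, doubled by the probe
CEILING = 60000.0          # count ceiling chosen to stay clear of saturation
BIAS = 3380.0              # approximate detector bias

# Ceiling interpreted as raw detector counts, so the bias is subtracted.
t_raw = t_max(CEILING, BIAS, I0, PEAK_CONTRAST)
# Ceiling interpreted as counts above the bias.
t_above_bias = t_max(CEILING, 0.0, I0, PEAK_CONTRAST)

res_raw = contrast_per_count(t_raw, I0)
res_above_bias = contrast_per_count(t_above_bias, I0)

print(f"t_max: {t_raw:.0f} or {t_above_bias:.0f} time units")
print(f"contrast per count: {res_raw:.2e} or {res_above_bias:.2e}")
```

The quoted exposure limit of ≈280 is recovered when the ceiling is taken as counts above the bias, and subtracting the bias gives ≈265. Either way the single-frame contrast resolution is ≈3 × 10−8, which is about 15% of the 2.3 × 10−7 contrast floor.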

It should be noted that the maximum exposure time is not necessarily recommended.

Perturbations due to model error or read error can push the maximum value beyond its

allowable limit, so a factor of safety should be applied in the exposure time. If the read noise

and additional readout time can be tolerated, it is always better to read out more frequently

and average images as a way of mitigating risk in acquiring the field. Additionally, if we

consider a correction scheme as in Ch. 4 where we do not suppress the field as we increase

our precision in the estimate the choice made in Eq. 6.1.7 to find tmax is not as clear since

the contrast level in the correction area does not go down as we estimate the field. This

decision will have to be made dynamically since it will be a function of whether the exposure

is intended for estimation or measuring the control effect. In the end, which term dominates

the maximum allowable exposure time when 10−Cestmax is still large will be a function of

the quality of the alignment since this will shift energy into the low working angles we are

concerned with. If our model is good enough, it is the maximum allowable exposure time

that will fundamentally bound the achievable contrast in a single iteration.


6.2 Estimation Algorithms and Propagating Error

Inaccuracies in the system model are the largest source of error in the experiment. These

typically are a result of approximations or measurement error of critical physical parameters.

Model error affects both the estimator and the controller, doubling its effect on the total

error. Specifically, these model errors will induce errors in the control effect matrix, Eq. 2.1.14,

and the observation matrix, Eq. 3.3.5. One critical constant used in all of our transforms

is the value of λf/D, the scaling parameter that appears in the exponential of the Fourier

transform, Eq. 1.5.25. An error in this value has one of the most dramatic effects, because

beyond producing the incorrect transform it also affects which pixels we sample for the dark

hole. If the value of λf/D is off by the physical dimension of a single pixel we will sample

the image plane incorrectly in the estimate and our control effect will appear in regions

unpredicted by the model. Since we can generally have a very high level of certainty in D

and λ, this error budget is usually left to uncertainty in the effective focal length of the

system. Note that even sub-pixel shifts can have an effect on the estimate because the DM

commands produce a continuous distribution and energy will unintentionally be shifted from

the edge of one pixel into another during estimation. For this reason, we find that knowing

λf/D to within a millimeter or less is very important to minimize the impact of model error

on the estimation and control. Fortunately, such a dramatic response to this error means

that we have the sensitivity to measure this properly and calibrate it by applying sinusoidal

functions to the DMs. Two model errors that are a bit more tenacious are uncertainty in

the propagation distance and uncertainty in the DM shape.

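An error in the propagation distance between a DM and the pupil affects the phase of the transfer function used in the estimation and control algorithms. Suppose we evaluate the DM transfer function in Eq. 1.6.27 while incorrectly assuming that the propagation distance is $\tilde{z}$, which has an error of δz. For the true distance, z, we describe the measurement error as $\tilde{z} = z + \delta z$. The computed transfer function with the incorrect propagation distance, $\tilde{\mathcal{C}}\{\cdot\}$, for the DM actuation is

\begin{align}
\tilde{\mathcal{C}}\{A\phi\} &= e^{-i \frac{\pi \tilde{z}}{\lambda f^2}(x^2+y^2)} \mathcal{F}\{A\phi\} \tag{6.2.1} \\
&= e^{-i \frac{\pi (z+\delta z)}{\lambda f^2}(x^2+y^2)} \mathcal{F}\{A\phi\} \nonumber \\
&= e^{-i \frac{\pi \delta z}{\lambda f^2}(x^2+y^2)} e^{-i \frac{\pi z}{\lambda f^2}(x^2+y^2)} \mathcal{F}\{A\phi\} = e^{-i \frac{\pi \delta z}{\lambda f^2}(x^2+y^2)} \mathcal{C}\{A\phi\}. \tag{6.2.2}
\end{align}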

Eq. 6.2.2 shows that we introduce an additional phase perturbation to the DM transfer

function that scales quadratically with working angle. Since we rely on this model to estimate

the field via probe shapes, Eq. 4.1.20, this perturbation also appears in the electric field

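estimate. Decomposing the DM command into spatial frequencies with $m_0$ cycles per aperture in the $u$ direction and $n_0$ cycles per aperture in the $v$ direction, each with an arbitrary phase shift $\theta$, we find that

\begin{align}
\mathcal{F}\{\phi\} &= \mathcal{F}\left\{\cos\left(\frac{2\pi}{D}(m_0 u + n_0 v) + \theta\right)\right\} \tag{6.2.3} \\
&= \frac{1}{2}\mathcal{F}\left\{e^{i\left(\frac{2\pi}{D}(m_0 u + n_0 v) + \theta\right)} + e^{-i\left(\frac{2\pi}{D}(m_0 u + n_0 v) + \theta\right)}\right\} \notag \\
&= \frac{1}{2}e^{i\theta}\,\delta\left(\frac{m}{D} - \frac{m_0}{D}, \frac{n}{D} - \frac{n_0}{D}\right) + \frac{1}{2}e^{-i\theta}\,\delta\left(\frac{m}{D} + \frac{m_0}{D}, \frac{n}{D} + \frac{n_0}{D}\right), \tag{6.2.4}
\end{align}

where $m/D$ and $n/D$ are the spatial frequency coordinates.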

Thus, there is a one to one correspondence between the value of the induced phase perturba-

tion at a specific location in the image plane with the value of the phase shift for a particular

spatial frequency in the DM plane. In other words, we can correlate the phase perturbation

from δz to a commanded spatial frequency by equating

\begin{equation}
e^{i\theta(x,y)} = e^{-i\frac{\pi \delta z}{\lambda f^2}(x^2+y^2)}. \tag{6.2.5}
\end{equation}
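The correspondence between the phase shift of a commanded cosine and the phase of its two image-plane peaks can be checked numerically. The following is a minimal one-dimensional sketch, assuming a discrete Fourier transform as a stand-in for the continuous transform; the sample count, the commanded frequency `m0`, and the phase `theta` are illustrative values, not laboratory parameters.

```python
import numpy as np

n_samples = 64     # samples across the aperture (hypothetical)
m0 = 5             # commanded cycles per aperture (hypothetical)
theta = 0.7        # phase shift of the commanded cosine, in radians

u = np.arange(n_samples) / n_samples           # aperture coordinate u/D in [0, 1)
phi = np.cos(2 * np.pi * m0 * u + theta)       # one-dimensional DM command

# Normalized DFT: the cosine splits into two peaks at +m0 and -m0.
spectrum = np.fft.fft(phi) / n_samples
pos_peak = spectrum[m0]      # carries the phase +theta
neg_peak = spectrum[-m0]     # carries the phase -theta
```

Each peak has amplitude one half and carries the phase of the command with opposite sign, so a phase perturbation at a given image-plane location maps directly onto the phase shift of one spatial frequency.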

Looking back to Eq. 6.2.4, the perturbation in the phase shifts at the DM plane will scale quadratically with spatial frequency, reducing the certainty of higher order correction. While this will be less significant at small angular separation, the higher uncertainty at large spatial frequencies will ultimately limit the achievable outer working angle of the dark hole. Since the error in phase shift will be applied at both the estimation and control steps, the phase errors induced on each spatial frequency of the DM command will be double the value in the exponential of Eq. 6.2.2. Fig. 6.1 shows the phase picked up per millimeter of error in the measurement of z. Since this phase scales quadratically, the phase error grows by a factor of sixteen between 4λ/D, the IWA of the Ripple3 coronagraph, and 16λ/D, the maximum controllable angle of the Kilo-DM©. Since the propagation distances of the DMs in the HCIL make the phase contribution on the order of 2π, this error is not noticeable at very low contrast levels, making it a much more subtle error than choosing the wrong focal length.

Physically, we can think of the DM command as being chosen to conjugate the field at a particular pixel in the image plane. With an error in z, any particular spatial frequency will arrive at the correct pixel on the detector, but the command won’t conjugate the field as well as predicted in the model. Since this error scales quadratically with spatial frequency, it becomes less significant as we try to achieve lower inner working angles. It will however become an issue when we try to increase our discovery space by pushing the outer working angle closer to the controllable limit of the DMs.

[Figure 6.1 plot: "IP Phase per mm of Propagation Error"; both axes in λ0/D from −10 to 10; color scale: Phase (rad), 1 to 8 ×10−3.]

Figure 6.1: An error in our knowledge of the propagation distance from the DM to the pupil induces, to first order, a phase error that scales quadratically with working angle. The figure shows the phase error across the image plane per mm of error in the measurement of the propagation distance.
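The quadratic growth of this phase error with working angle can be illustrated with a short calculation. This is a sketch under stated assumptions: the image-plane radius is written as k λf/D so that the focal length cancels in the exponent of Eq. 6.2.2, and the wavelength, aperture, and distance error below are hypothetical values chosen for illustration, not the HCIL parameters.

```python
import numpy as np

def propagation_phase_error(k, delta_z, wavelength, aperture):
    """Magnitude of the phase error (rad) at a working angle of k lambda/D.

    Substituting r = k * wavelength * f / D into pi * delta_z * r**2 / (wavelength * f**2)
    gives pi * delta_z * wavelength * k**2 / D**2, independent of the focal length f.
    """
    return np.pi * delta_z * wavelength * k**2 / aperture**2

# Hypothetical values, for illustration only.
wavelength = 635e-9   # m
aperture = 10e-3      # m
delta_z = 1e-3        # m, i.e. 1 mm of error in the propagation distance

inner = propagation_phase_error(4.0, delta_z, wavelength, aperture)
outer = propagation_phase_error(16.0, delta_z, wavelength, aperture)
ratio = outer / inner          # (16 / 4)**2, independent of the assumed values
doubled_outer = 2.0 * outer    # the error enters in both estimation and control
```

The ratio depends only on the two working angles, which is why the outer edge of the dark hole is affected far more strongly than the inner edge regardless of the specific optical prescription.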

A second source of model error is the DM shape. We must recognize that while the

superposition model applied for the estimation and control algorithms in Ch. 2, Ch. 3, and

Ch. 4 is very useful, the true DM shape is nonlinear and subject to a plate equation. This

is very difficult to model, and we mitigate this by remaining in a low stroke regime where

the assumption of superposition and linearity is valid. There is also an inherent uncertainty in the model of the influence function shape. If the full width at half maximum of a single actuator's influence function is incorrect, then the spatial frequencies we apply will not be what is modeled. If we approximate a sine wave by pushing and pulling adjacent actuators, then the period of this oscillation is dependent on the full width at half maximum of the two actuators. Thus, for a fixed pupil we will assume the incorrect spatial frequency and this will cause energy to be distributed in areas we did not intend. The true influence function tends not to be a simple Gaussian either, having little wings towards the edges of the actuator. This produces a roughly additive error at mid to high spatial frequencies, but it is of very low

amplitude. Thus this effect is not seen until we reach very high contrast levels, where the

amplitude of the perturbations from prior iterations becomes the same order as the control

amplitudes being attempted. Finally, our model assumes a particular actuator gain which

is itself a nonlinear response (at high stroke) and subject to a large amount of variation. If

we continue to adhere to the model of superposition and stay in the low stroke regime, we

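may treat gain uncertainty as additive errors across the DM plane. We describe the modeled DM actuation, $\tilde{\phi}$, as the sum of the true actuation, $\phi$, and an additive perturbation present in the model, $\delta\phi$. Since $\delta\phi$ is the result of a gain mismatch, the perturbations in the model will follow the sign of the DM actuation, allowing us to write the model of the differential measurement described by Eq. 3.3.1 as

\begin{align}
I^+ - I^- &= 4 \begin{bmatrix} \Re\{iC\{A\tilde{\phi}\}\} & \Im\{iC\{A\tilde{\phi}\}\} \end{bmatrix} \begin{bmatrix} \Re\{C\{A\} + C\{Ag\}\} \\ \Im\{C\{A\} + C\{Ag\}\} \end{bmatrix} \tag{6.2.6} \\
&= 4 \begin{bmatrix} \Re\{iC\{A(\phi + \delta\phi)\}\} & \Im\{iC\{A(\phi + \delta\phi)\}\} \end{bmatrix} \begin{bmatrix} \Re\{C\{A\} + C\{Ag\}\} \\ \Im\{C\{A\} + C\{Ag\}\} \end{bmatrix}. \tag{6.2.7}
\end{align}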

Since the model assumes that there is an additional component contributing to the intensity measurements, we find that the model's observation matrix is

\begin{equation}
\tilde{H} = H + \delta H, \tag{6.2.8}
\end{equation}

indicating that the model artificially changes the mapping between the electric field and the observation, $I^+ - I^-$. Each element of $\delta H$ will make the model's observation matrix, $\tilde{H}$, predict the incorrect response of z to the field’s interaction with that particular probe shape.

As a result, there is a spatially varying energy mismatch in the image plane between the

estimated and true states due to uncertainty in the actuator gain. If the gain variations

are random, then the effect on any particular spatial frequency is different. However, a

systematic error in the gain model can affect some spatial frequencies more than others.

For example, if we command a 16 cycle per aperture oscillation across the Kilo-DM, every actuator is being pulled in an opposing direction to its adjacent actuators. However, if we command an 8 cycle per aperture oscillation across the DM, we end up with pairs of actuators pulling in the same direction. There are some measurements from Boston Micromachines

that show increases in the voltage-to-amplitude gain when multiple actuators are pulling in

concert. Thus, we would underpredict the actuator gains for 8 cycles/aperture compared

to the gain for a 16 cycle/aperture command. Fortunately, the interferometric data shown

in Fig. 6.2 indicates that in the low stroke regime, the mathematical superposition of two

adjacent actuators very closely replicates pulling the two at the same time. Thus, the additive perturbation, δH, will not be a function of spatial frequency (which would make computing its effect much more complicated). Instead, we expect variability in actuator gain to be the most common source of error; the effect of gain mismatch in the model will be relatively random instead of dependent on the spatial frequencies being actuated.

[Figure 6.2 plot: "Height Change of Adjacent Actuators on DM2"; height change (nm) versus actuator width (mm) at an applied voltage of 56 V; curves: One Actuator, Two Actuators, Superposition, Two − One.]

Figure 6.2: Using a white light interferometer at Boston Micromachines, we have been provided with the following superposition data for the Kilo-DMs used in the Princeton HCIL. For a stroke that far exceeds those used in the control experiment, there is only a 10.6 nm (∼7%) discrepancy between superposing two actuators and pulling the two together.
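Because the probe rows of the observation matrix are linear in the DM actuation, an additive gain perturbation in the actuation maps to an additive perturbation of the observation matrix. The following is a minimal sketch of that statement, assuming a plain 2-D FFT as a stand-in for the operator C{·}; the array size, probe amplitude, and gain error level are hypothetical.

```python
import numpy as np

def propagate(field):
    """Stand-in for the linear operator C{.}; a plain 2-D FFT is assumed here."""
    return np.fft.fft2(field)

def observation_rows(aperture, probe):
    """Rows 4 * [Re{iC{A phi}}, Im{iC{A phi}}] for one probe, one row per pixel."""
    g = 1j * propagate(aperture * probe)
    return 4.0 * np.stack([g.real.ravel(), g.imag.ravel()], axis=1)

rng = np.random.default_rng(0)
n = 16
aperture = np.ones((n, n))
phi_true = 1e-2 * rng.standard_normal((n, n))   # true probe actuation
gain_error = 0.05 * rng.random((n, n))          # hypothetical gain overestimate, up to 5%
delta_phi = gain_error * phi_true               # follows the sign of the actuation
phi_model = phi_true + delta_phi                # actuation assumed by the model

H_true = observation_rows(aperture, phi_true)
H_model = observation_rows(aperture, phi_model)
delta_H = observation_rows(aperture, delta_phi)
```

With a random gain error the perturbation `delta_H` spreads over all image-plane pixels rather than concentrating at particular spatial frequencies, which is the behavior expected when actuator-to-actuator variability dominates.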

[Figure 6.3 plots: (a) DM1 Actuator Shapes, "DM1 Relative Actuation From 56V offset"; (b) DM2 Actuator Shapes, "DM2 Relative Actuation From 56V offset". Each panel shows height change (nm) versus actuator width (mm) in the X and Y directions, with applied voltage labels of 94 V, 56 V, and 0 V.]

Figure 6.3: Using a white light interferometer at Boston Micromachines, we have been provided with the following data for the Kilo-DMs used in the Princeton HCIL. The data indicates that not only is the influence function perturbed from a Gaussian, but the shape is a function of x and y. Additionally, the actuator gain is a function of whether the actuator is being released or pulled from the DC offset applied to the mirror.

The last systematic gain error that we must worry about is demonstrated in Fig. 6.3. In the low stroke regime, we have a limited amount of data that would indicate that the gain will be relatively constant regardless of the behavior of adjacent actuators. However, we also see that this gain changes based on whether we are releasing or pulling (“poking” or “pulling”) the actuator from the DC offset. This is most likely due to the internal stresses in the face sheet, which are intentionally introduced by the manufacturer to force the DM surface to a flatter nominal shape. Knowing approximately what these gains are, the control model computes a new gain matrix on the fly, changing the gain based on whether it is a push or a pull at that control iteration. This resulted in a small improvement in the convergence rate and ultimate achievable contrast in the laboratory.
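A sign-dependent gain of this kind can be sketched in a few lines. This is an illustration only, not the laboratory code: the function name, the sign convention (positive height change treated as a push), and the gain values are all hypothetical.

```python
import numpy as np

def height_to_voltage(delta_h_nm, gain_push, gain_pull):
    """Convert desired height changes (nm) to voltage changes with sign-dependent gains.

    gain_push and gain_pull are in nm per volt. Positive height changes are
    treated as pushes and negative ones as pulls (an assumed convention).
    """
    delta_h_nm = np.asarray(delta_h_nm, dtype=float)
    # Select the gain actuator by actuator from the sign of the requested change.
    gains = np.where(delta_h_nm >= 0.0, gain_push, gain_pull)
    return delta_h_nm / gains

# Hypothetical gains: the same height change costs a different voltage step
# depending on the direction of motion.
command = height_to_voltage([4.0, -4.0], gain_push=2.0, gain_pull=4.0)
```

Because the gain selection depends on the command itself, the gain matrix has to be rebuilt at every control iteration rather than computed once.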

6.3 Accuracy of Wavelength Extrapolation

As discussed in §2.5, the extrapolation method used for broadband control assumes that

the amplitude variations across the pupil are wavelength independent, completely neglect-

ing the perturbations from surfaces non-conjugate to the pupil. The degree of accuracy is

completely dependent on the surface flatness of the DMs, particularly at low to mid-spatial

frequencies. Since the phase errors across the mirror act as small focusing elements as the

light propagates towards the pupil, amplitude variations from these surfaces will be highly

chromatic. Since we have confined our wavelength assumption to an estimate extrapolation, rather than building it into the control law, we will see poorer contrast performance at our bounding wavelengths. This will tend to drive the contrast at the central wavelength lower than at the bounding wavelengths (under equal weighting in the controller) and result in higher actuation levels from the DMs. The effect of this extrapolation can be minimized either by attempting to add higher order terms or by attempting to construct a sensing scheme by which we produce a polynomial describing the wavelength dependence of the pupil plane.

6.4 DM Controllable Space

In §6.2 the issue of error in the deformable mirror model was addressed. More advanced learning and adaptive algorithms, the beginning of which is presented in this thesis, are designed to mitigate the errors that this model error propagates through the control code. Whether or not the desired shape is achievable is a separate issue. The inability to solve for an arbitrary shape will limit the achievable contrast and the area over which it can be achieved. For example, the monochromatic subspace in the image plane cannot extend beyond nλ/D, where n is half the number of actuators along the chosen direction. Additionally, pure sinusoids

cannot be actuated, and the end condition of the DM limits the slopes and achievable shapes

of the mirror. The distribution of actuators also comes into play, where a square array of

actuators cannot perfectly generate a shape with circular symmetry.
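The nλ/D limit is a sampling limit: a command with more than n cycles per aperture is indistinguishable, at the actuator locations, from one below the limit. The sketch below illustrates this with one sample per actuator and ignores the influence function; the actuator count of 32 is assumed for illustration.

```python
import numpy as np

n_act = 32                    # actuators across the DM (assumed for illustration)
j = np.arange(n_act)          # actuator index

def sampled_cosine(cycles):
    """Cosine with the given number of cycles per aperture, sampled once per actuator."""
    return np.cos(2 * np.pi * cycles * j / n_act)

nyquist = n_act // 2          # highest controllable frequency, in cycles per aperture
inside = sampled_cosine(12)   # below the limit
aliased = sampled_cosine(20)  # above the limit; aliases onto 12 cycles per aperture
```

The two sampled commands are identical, so attempting to control 20λ/D with this array would only move energy at 12λ/D.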

6.5 Experiment Stability and Laser Power

Disturbances in the system will eventually degrade a dark hole back to its original contrast

level (or near it). This will ultimately limit our achievable contrast at a fundamental level,

but to truly understand this we must understand the time scale of these disturbances. To

test this, we began by running the correction algorithm to generate symmetric dark holes

in the image plane. The DM commands were then fixed and we proceeded to measure

the contrast over a 500 minute period. Simultaneously we re-imaged a pupil plane with a

beam splitter so that we could normalize by the total power incident on the image. This

removes any variability in the source itself, but leaves behind any variability in the detector

gain, or potentially a rapid fluctuation in the laser power that was not captured by the

pupil image. Looking at Fig. 6.4(a), Fig. 6.4(b), and Fig. 6.4(c) we see very little variation in the structure of the dark hole over the 8 hour period that we measured the contrast performance. To see the image variation, we look to the deviation in the average contrast in the dark hole. Fig. 6.4(d) indicates that the contrast variance can become a significant fraction of our ultimate achievable contrast measured in the initial image, but over much longer time scales than are required for correction. Fig. 6.4(e) shows the running difference between measurements over the time history. Thus we see that with a fixed DM setting the variance over a five minute interval can be of order $10^{-8}$ to $10^{-7}$, which means that we cannot correct fast enough to “freeze” the disturbances below this. Thus we can reach $10^{-7}$ contrast levels because we are capable of measuring and controlling on a timescale that is faster than the five minute variation of the experimental disturbances at a level of $1\times10^{-7}$, but to reach contrast levels below $1\times10^{-7}$ we must improve the short term stability of the system. Since imaging is by far the slowest component, particularly when we reach levels of high contrast, the simplest solution is to increase the power of the laser so that the integration time, and hence the time required for estimation and control, is reduced.

[Figure 6.4 plots: (a) Initial IP, (b) IP after 4 hours (t = 240 minutes), (c) IP after 8 hours (t = 480 minutes), each showing log10(Contrast) from −6.5 to −4 over ±10 λ0/D; (d) Contrast Deviation vs. Time, "Absolute Variation From Initial Image"; (e) Running Difference vs. Time, "Running Difference of Contrast". Panels (d) and (e) plot contrast variation (×10−7) against time (min) from 0 to 480.]

Figure 6.4: With fixed DM shapes, the contrast of symmetric dark holes was measured over a 500 minute period after creating two dark holes. To remove laser power fluctuations from the measurement, a camera was placed in the pupil plane with a beam splitter to allow us to normalize the contrast by the variance of the total power incident on the pupil over time. (a), (b), and (c) show the evolution of the dark holes in time. (d) shows the absolute variance in time from the initial contrast. (e) shows the running difference from one image to the next over time. Note that while we see significant variation at ∼1/10th the intensity of the dark hole over time, the visual difference is barely discernible.
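The two stability metrics used here, the deviation from the initial image and the running difference between consecutive images, can be sketched as follows. The function name and the input values are hypothetical, and the normalization is assumed to be a simple rescaling by the relative power measured in the pupil image.

```python
import numpy as np

def contrast_stability(dark_hole_mean, pupil_power):
    """Return normalized contrast, deviation from the first frame, and running difference."""
    dark_hole_mean = np.asarray(dark_hole_mean, dtype=float)
    pupil_power = np.asarray(pupil_power, dtype=float)
    # Rescale each frame by the source power relative to the first frame.
    contrast = dark_hole_mean * pupil_power[0] / pupil_power
    deviation = contrast - contrast[0]      # variation from the initial image
    running_diff = np.diff(contrast)        # change from one image to the next
    return contrast, deviation, running_diff

# Hypothetical time series: the last frame was taken at twice the source power.
contrast, deviation, running_diff = contrast_stability(
    [1.0e-7, 1.1e-7, 2.4e-7], [1.0, 1.0, 2.0]
)
```

The deviation tracks slow drift away from the corrected state, while the running difference bounds how much the field changes over the interval between images, which is the quantity that limits how deep the controller can "freeze" the disturbance.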

Interestingly, this speaks to the required stability for wavefront control for a real obser-

vatory. To fairly assess the flight readiness of a control system achieving high contrast, the

intensity of the light source must be scaled so that the contrast stability of the experiment

matches what we expect in the observatory. Laboratories such as the HCIL, HCIT, and the

AMES testbed use very large amounts of power to minimize the exposure time and build up

a high level of statistical certainty in their measurement. In this way they are able to push

the absolute limit of achievable contrast, but it does not speak to the controller performance

in a true observatory environment. Relating this back to §6.1, increasing laser power is

equivalent to increasing $I_{00}$ so that we may decrease $t_{max}$ and still achieve the same contrast

resolution. This actually points to a very interesting mapping problem where we should

pick the laser power based on the photon and sensor noise in the system, making everything

comparable to our expected observatory performance. At the Princeton HCIL, we will eventually be limited by the time required to take an exposure. For the Starlight Xpress cameras that we use, the readout time is approximately 0.6 seconds when reading out the full frame. Thus, regardless of our computation time, each pair of images taken for estimation requires that the aberrations be stable for at least 1.2 seconds for the estimate to be accurate enough to reach higher levels of contrast. There is a large body of work investigating the stability requirements for a mission capable of imaging Earth-like planets [26, 27, 64]. The more aggressive the coronagraph is, and the smaller the desired inner working angle, the stricter these tolerances become. Parameters such as secondary alignment or low order bending modes for the primary mirror (and its support structure) can have tolerances in the range of nanometers to picometers [66].

6.6 Stability of Laser Power

Variance in the incident flux skews the measurement of contrast, which means that at some

level of contrast the variance in the laser power will dramatically affect the normalization

of the field. The primary cause of laser power fluctuation is temperature variation

in the gain medium. These fluctuations affect the mean gain value, resulting in a power

fluctuation at the output of the oscillator. Since we can average random fluctuations over

time, we are only concerned with static drifts in the power level. Static drift in the output

power is mitigated with temperature feedback control on the gain medium of the laser. As

a result, we are not overly concerned with systematic power drift through the course of

control. However, we should keep in mind that while this dominates, temperature is not the

only possible source of power drift. In fact, the laser was picked based on its power stability

spanning the duration of control. We need to ensure that we are averaging exposures with an

adequate signal-to-noise ratio to eliminate any residual fluctuations in the laser power arising

from temperature fluctuations in the gain medium. Once high contrast values are reached

the variance in the normalization becomes significant and active normalization is required.

This motivates a system which measures the core of the light, which is reflected off the

focal plane mask. This allows for accurate measurement of contrast since the normalization

constant can be actively adjusted.

6.7 Final Remarks on Error

Overall, the errors limiting our correction performance come in two categories: model errors, which limit the ultimate achievable contrast and the convergence rate of the system, and experimental limitations, which tend to limit the ultimate achievable contrast. Ideally, we would like to be limited by the system, not our model. The results from the first published symmetric dark hole experiment in Pueyo et al. [55] were partly a result of laboratory instabilities, but the dominant errors were model based, one example being errors in the influence function shape and voltage-to-actuation gains used to produce the control effect matrices. Fig. 6.5 compares the current laboratory performance with that of the original data set in Pueyo et al. [55]. Both data sets use the monochromatic Stroke Minimization algorithm, but the new data set employs an improved model and the Kalman filter estimation scheme. With these improvements we are now capable of achieving the same contrast levels in nearly one tenth the time. We have also improved the ultimate achievable contrast by a full order of magnitude, achieved it in half as many iterations, and used half the exposures per iteration to do so.

[Figure 6.5 plot: "Comparing Contrast Performance"; contrast (log scale, 10−7 to 10−4) versus iteration (0 to 60); curves: AVG, Left, and Right Contrast 2009; Current AVG, Left, and Right Contrast.]

Figure 6.5: The figure compares the ultimate achievable contrast and convergence rate in symmetric dark holes from the initial 2009 experiments to current laboratory performance. Since the limitations were largely model based, we see that both the ultimate achievable contrast and the convergence rate improve.

Chapter 7

Conclusions and Future Directions

The purpose of the broadband control and estimation work presented in this thesis is to

improve the efficiency and robustness of wavefront correction for high contrast. The estimate

extrapolation techniques applied to the broadband controller enable us to suppress the field

using only a single estimate. This reduces the number of exposures simply by not requiring

estimates for every wavelength. By closing the loop on the state estimate, we exploited the

Kalman filter to reduce our requirement on the number of exposures taken to estimate the

field. The filter also made the estimate more robust to noise, since it is designed specifically

to optimally combine the prior history and update measurements. This was particularly

useful at high contrast levels, where sensor noise can dominate the measurement. Overall, the combination of the Kalman filter estimator and the Windowed Stroke Minimization controller allowed us to produce a more stable estimate and to reach the same contrast levels with one sixth of the measurements that prior versions of estimation and broadband control would have required. However, none of this work is capable of reducing the effect of model error addressed in Ch. 6, nor does it directly minimize the number of exposures used to achieve a particular contrast level. Having developed a closed loop state estimator, we now have the framework to consider more advanced estimation and control schemes designed to do just that. The following sections discuss ways in which we might address time varying physical parameters that will induce model error and attempt to find a global optimization of the estimation and control problem with regard to exposure time.

7.1 Parameter Adaptive Filtering

In Ch. 6 we discussed many limitations to the accuracy of correction. Among the worst was

DM model error, which cannot be eliminated by the estimation algorithms of Chapters 3 and

4 since the model error is built into the observation matrix. We do, however, have one method of recourse. We can attempt to estimate the most critical physical parameters during control

and thereby produce an adaptive model for our controller. In truth, nothing would be better

than to measure the parameters as well as possible prior to attempting control so that they

need not be estimated in a potentially nonlinear and computationally expensive fashion. The

correction algorithms presented in this thesis will also undoubtedly suppress the aberrated

field more efficiently if they have the correct parameters from the start. However, we require

knowledge of many parameters to such a high level of precision that the observatory must be

considered as a dynamic system. While we may have a high level of certainty to begin with,

slow variations in the observatory (such as thermal fluctuations or vibrations due to station

keeping) will cause these parameters to drift. As a result we have no choice but to actively

measure their variation [66], which is the purpose of sensors such as the coronagraphic low

order wavefront sensor (CLOWFS) [32], or to estimate them along with the image plane

electric field.

7.2 Dual Controller

This thesis has focused a great deal on optimal estimators and controllers, each of which

is guaranteed to carry some form of global optimality. However, we have not in any way

guaranteed that we have reached our lowest achievable contrast with the smallest possible

number of exposures. The control algorithms in Ch. 2 minimize actuation, not images. However, they only truly require one image to measure the control effect and take a contrast measurement, so this is already minimal. The Kalman filter estimator developed in Ch. 4 does leave a free parameter with regard to the number of exposures taken for estimation at each control iteration. However, there is no minimization through the entirety of control. There is a balance to be had in the number of images taken at a particular iteration of the estimator, because this has a consequence on the accuracy of the estimate due to the averaging nature of the estimator. For example, if we use one set of exposures per iteration to suppress the field to a desired level, we may have been able to do the same job with fewer than half the iterations if we had doubled our estimation exposures at each iteration. The purpose of a dual controller is to minimize our requirements on the

control effect and the estimator at the same time. In other words, the dual controller uses

an optimal control law to decide whether it must perturb the field to gain more knowledge

or if it should suppress the field to achieve its target. By designing such an algorithm we

could potentially save on exposures required for correction, leaving more time available to

take science exposures. This in turn increases the likelihood of detection and makes spectral

characterization more efficient.
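The trade between exposures and estimate accuracy can be seen in the simplest possible case. The sketch below assumes a scalar state observed directly (observation matrix equal to one) with a fixed measurement noise variance; it is an illustration of the averaging behavior of the Kalman measurement update, not the estimator of Ch. 4, and all values are hypothetical.

```python
def posterior_variance(prior_var, meas_var, n_exposures):
    """Variance of a scalar state after n sequential Kalman measurement updates (H = 1)."""
    p = prior_var
    for _ in range(n_exposures):
        gain = p / (p + meas_var)   # Kalman gain for a direct observation
        p = (1.0 - gain) * p        # covariance update
    return p

# Hypothetical unit variances: each additional exposure tightens the estimate.
one_exposure = posterior_variance(1.0, 1.0, 1)
two_exposures = posterior_variance(1.0, 1.0, 2)
```

In this scalar case the result matches the closed form 1 / (1/P0 + n/R), so the benefit of each additional exposure diminishes as n grows. A dual controller would weigh that diminishing return against the cost of the exposure.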

7.3 Including Alternate Sensors

Up to this point there has been an underlying assumption that we are working within the

stability range inherent to a space telescope. However, the concepts here are becoming more

applicable to ground-based coronagraphic efforts [50, 47, 8] as their AO systems improve.

For the purpose of this thesis, the assumption is that the AO system designed for atmo-

spheric correction is the most upstream sensor in the optical path and is operating at the

wavelength farthest from the final image plane camera. For this reason this data will be the

most difficult to incorporate into the update, but has the advantage of being the fastest update with the most information about residual atmospheric turbulence. Given the stability requirements laid out in Shaklan et al. [66] for a space telescope, we will assume that a low order wavefront sensor will be present in both systems. As shown in Fig. 7.1, the high order wavefront sensor is upstream of the final focal plane stop in the coronagraph. We assume it is taking measurements to correct variations in the optical system that degrade the quality of the PSF. Since this measurement is taken in the collimated beam and is operating at a shorter wavelength [50], it is unclear how well it can be applied to the focal plane estimation algorithms presented in this thesis. However, the coronagraphic low order wavefront sensor [32] takes a measurement by directly imaging the reflected starlight off the focal plane mask, meaning that it operates at the same wavelength as the science channel. In a space observatory this would be able to directly measure and compensate for thermal fluctuations in large system optics. For a ground-based telescope this would also include slow speed, high order, residual atmospheric turbulence. This is the final wavefront measurement prior to the coronagraphic channel, and therefore shares the most common path with the science path. This measurement, however, is still assumed to be taken at a different bandpass, but is necessarily at a closer bandpass than the fast AO loop.

Figure 7.1: Schematic representing the relative position of different wavefront sensors in the optical path of a coronagraphic instrument.

7.3.1 Establishing a Reference

To make a differential measurement, we must first establish a reference field against which to measure changes in the output of the CLOWFS. Since the focal plane estimator would be considered a high precision periodic update to the estimate, we must first calibrate the relative changes to properly balance the two measurements. Rather than writing a linear Kalman filter which updates the state as in Eq. 4.1.27, we will write our update as

\begin{equation}
x_k(-) = F_{k-1}\left(x_{k-1}(+)\right) + \Gamma_{k-1} u_{k-1}. \tag{7.3.1}
\end{equation}

We will now include a model based update that maps differential changes in the measure-

ments taken with a CLOWFS and higher order wavefront sensors to their image plane effect

via the time update $F_{k-1}(x_{k-1})$. The accuracy of this model will be highly dependent on

the accuracy of the model transforming the measurement to the final image plane, as well

as the resolution and accuracy of the sensor itself. Our expectation is that this will be less

accurate for extrapolating to the current state than applying probe images, but will allow us

to operate on a much faster timescale. This will in turn allow us to take fewer exposures in-

tended to probe the field, making the estimate faster and more stable to large perturbations

in the field that would otherwise drive the probe model out of the linear regime.
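The structure of the time update in Eq. 7.3.1 can be sketched as a function that accepts an arbitrary state propagation model. The example model below, which simply adds a field change inferred from an auxiliary sensor, is a hypothetical placeholder for the sensor-to-image-plane mapping, and all numerical values are illustrative.

```python
import numpy as np

def time_update(x_post, propagate_state, gamma, u):
    """Eq. 7.3.1: x_k(-) = F_{k-1}(x_{k-1}(+)) + Gamma_{k-1} u_{k-1}."""
    return propagate_state(x_post) + gamma @ u

# Hypothetical model F: add the image-plane change inferred from the sensors.
sensor_delta = np.array([0.1, -0.2])

def propagate_state(x):
    return x + sensor_delta

x_post = np.array([1.0, 2.0])                 # [Re(E), Im(E)] at one pixel
gamma = np.array([[0.5, 0.0], [0.0, 0.5]])    # control effect on the state
u = np.array([-2.0, -4.0])                    # DM command

x_prior = time_update(x_post, propagate_state, gamma, u)
```

Passing the model in as a function keeps the update agnostic to whether the mapping is linear, which matters once the sensor data are taken at a different wavelength or in a non-common path.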

7.3.2 Applying Reference to the Time Update

The focal plane techniques described in this thesis are most effective when the estimation

and control steps can be applied within the timescale of the remaining aberrations. For this

reason, these techniques can only be applicable on ground-based telescopes if they are done

after a fast AO loop for atmospheric correction. Fortunately, the Kalman filter formalism

built into the estimation will allow us to account for time evolution of the aberrated field

via a time update on the previous state using the sensor data from these prior control loops.

These sensors are commonly introduced via a dichroic to avoid losing photons in the science


channel, which means that we must address the issue of non-common path and wavelength

at each of these channels in order to use this sensor data.

7.4 Bias Estimation

This thesis has focused on estimation schemes that rely on a linear model to say that we

can take differential measurements with conjugate DM settings to estimate the electric field.

However, there is a large incentive to eliminate the need to take pairwise measurements for each observation in the estimate. By taking a single image per observation (with background subtraction), we reduce our dependence on a linearization of the field to produce the estimate, and we automatically cut the required exposures for estimation in half. Additionally, the planet itself is a source of bias in the image. Thus, if we include bias estimation we can use the entire control history to disambiguate planets from residual speckles. Since the bias is an estimate built up over time with inadequate signal-to-noise, we will still have to take an exposure at high contrast to extract a high quality spectrum. Fortunately, the bias

estimation allows us to identify candidates before we are required to take long exposures at

high contrast levels. With the disambiguation happening in closed loop, we are not only

using the correction time history to solve the identification problem but this also allows us

to adaptively modify the dark hole area and contrast level. This also relaxes the required

size of the dark hole at the highest contrast levels, relaxing the control tolerances once we

attempt detection.
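The reason a pairwise measurement cannot see an incoherent bias, while a single probed image can, follows from a one-pixel calculation. The sketch below uses hypothetical values for the aberrated field `E`, the probe field `p`, and the incoherent intensity `bias`, and assumes the probe field is known exactly.

```python
import numpy as np

E = 0.3 - 0.4j       # aberrated field at one pixel (hypothetical)
p = 0.05 + 0.02j     # probe field added by the DM (hypothetical)
bias = 0.01          # incoherent intensity, for example a planet

I_plus = abs(E + p) ** 2 + bias
I_minus = abs(E - p) ** 2 + bias

# Pairwise difference: the bias cancels, leaving only the cross term.
difference = I_plus - I_minus
# Single image with the known probe intensity removed: the bias survives.
single = I_plus - abs(p) ** 2

cross_term = (E * np.conj(p)).real
```

The difference reduces to four times the cross term, while the single-image quantity retains both the field intensity and the bias, which is what makes the bias observable over the control history.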

7.5 Final Remarks

For a coronagraph to be a viable option for a space mission detecting and characterizing

Earth-like planets, the wavefront control must be both effective and efficient. Even in a hybridized

scheme where an external occulter is used for spectral characterization, we must

rely on the coronagraph's ability to reach 10^{-10} contrast levels to achieve detection. Even

as a detection instrument the broadband performance of the wavefront control system will

drive the bandwidth available for detection, and a wider band is the easiest way to reduce exposure

time in a photon-limited observation. The time required to perform wavefront control will

also directly impact the number of targets visited, and hence the number of planets the

coronagraph is capable of detecting over a finite mission life. The overall efficiency of the

wavefront control system, i.e. how fast it can suppress the field to the required contrast

level, is also a function of the quality of the optical surfaces and the stability of the observa-

tory. Shaklan et al. [66] show that the observatory will require picometer levels of stability

in focus alone to maintain such high levels of contrast. We will never be able to beat the

fundamental limitations of the observatory, so the spirit of this thesis is to push the accuracy

and efficiency of the wavefront control algorithm to a level where the observatory variation

is the only limitation, rather than the controller, at any point in time. This will maximize

our time spent doing science, making the mission more cost effective. To that end, we have

used the HCIL to demonstrate new estimation and broadband control schemes. In these

experiments we have put just as much value on their convergence rate and effective use of

acquired data as we have on the ultimate achievable contrast. We have shown that when the

experiment is model limited, improvements in ultimate achievable contrast automatically map

to faster convergence rates. The Kalman filtering and extrapolation techniques increase our

efficiency simply by reducing the estimator's need for new exposures at each iteration.

Eventually, these improvements will make the estimator effective for the stability level seen

at ground-based observatories. This will become the ultimate platform for demonstrating

the effectiveness of these focal plane wavefront control concepts for a space mission, where

the true stability is expected to be better than a ground observatory but will not be well

known.

Bibliography

[1] URL http://exoplanets.org/.

[2] JRP Angel and NJ Woolf. An imaging nulling interferometer to study extrasolar planets.

The Astrophysical Journal, 475:373, 1997.

[3] S.A. Basinger, L.A. Burns, D.C. Redding, F. Shi, D. Cohen, J.J. Green, C.M. Ohara, and

A.E. Lowman. Wavefront sensing and control software for a segmented space telescope.

In Proceedings of SPIE, volume 4850, pages 362–369. Citeseer, 2003.

[4] N.M. Batalha, W.J. Borucki, S.T. Bryson, L.A. Buchhave, D.A. Caldwell,

J. Christensen-Dalsgaard, D. Ciardi, E.W. Dunham, F. Fressin, T.N. Gautier, et al.

Kepler’s first rocky planet: Kepler-10b. The Astrophysical Journal, 729:27, 2011.

[5] R. Belikov, A. Give’on, B. Kern, E. Cady, M. Carr, S. Shaklan, K. Balasubramanian,

V. White, P. Echternach, M. Dickie, J. Trauger, A. Kuhnert, and N. J. Kasdin. Demon-

stration of high contrast in 10% broadband light with the shaped pupil coronagraph.

Proceedings of SPIE, 6693:pp. 66930Y–1 – 66930Y–11, September 2007.

[6] R. Belikov, E. Pluzhnik, M.S. Connelley, F.C. Witteborn, T.P. Greene, D.H. Lynch,

P.T. Zell, and O. Guyon. Laboratory demonstration of high-contrast imaging at 2 λ/d

on a temperature-stabilized testbed in air. In Proceedings of SPIE, volume 7731, page

77312D, 2010.

[7] R. Belikov, E. Pluzhnik, F.C. Witteborn, T.P. Greene, D.H. Lynch, P.T. Zell, and

O. Guyon. Laboratory demonstration of high-contrast imaging at inner working angles

2 λ/d and better. In Proceedings of SPIE, volume 8151, page 815102, 2011.

[8] J.L. Beuzit, M. Feldt, K. Dohlen, D. Mouillet, P. Puget, F. Wildi, L. Abe, J. Antichi,

A. Baruffolo, P. Baudoz, et al. Sphere: a planet finder instrument for the vlt. In

Proceedings of SPIE, volume 7014, page 41, 2008.

[9] Celia Blain, Rodolphe Conan, Colin Bradley, Olivier Guyon, and Curtis Vogel. Char-

acterisation of the influence function non-additivities for a 1024-actuator mems de-

formable mirror. Proceedings of the 1st AO for ELT conference, 01 2010. URL

http://arxiv.org/abs/1001.5048.

[10] M.R. Bolcar and J.R. Fienup. Sub-aperture piston phase diversity for segmented and

multi-aperture systems. Applied Optics, 48(1):A5–A12, 2009.

[11] P.J. Borde and W.A. Traub. High-contrast imaging from space: Speckle nulling in a

low aberration regime. The Astrophysical Journal, 638:488–498, February 2006.

[12] W.J. Borucki, D.G. Koch, G. Basri, N. Batalha, A. Boss, T.M. Brown, D. Caldwell,

J. Christensen-Dalsgaard, W.D. Cochran, E. DeVore, et al. Characteristics of kepler

planetary candidates based on the first data set. The Astrophysical Journal, 728:117,

2011.

[13] W.J. Borucki, D.G. Koch, G. Basri, N. Batalha, T.M. Brown, S.T. Bryson, D. Caldwell,

J. Christensen-Dalsgaard, W.D. Cochran, E. DeVore, et al. Characteristics of plane-

tary candidates observed by kepler. ii. analysis of the first four months of data. The

Astrophysical Journal, 736:19, 2011.

[14] G. Bryden, W. Traub, L.C. Roberts Jr, R. Bruno, S. Unwin, S. Backovsky, P. Brugarolas,

S. Chakrabarti, P. Chen, L. Hillenbrand, et al. Zodiac ii: debris disk science from a

balloon. In Proceedings of SPIE, volume 8151, page 81511E, 2011.

[15] A. Carlotti, R. Vanderbei, and NJ Kasdin. Optimal pupil apodizations of arbitrary

apertures for high-contrast imaging. Optics Express, 19(27):26796–26809, 2011.

[16] W. Cash. Detection of earth-like planets around nearby stars using a petal-shaped

occulter. Nature, 442(7098):51–53, 2006.

[17] D.J. Des Marais, M.O. Harwit, K.W. Jucks, J.F. Kasting, D.N.C. Lin, J.I. Lunine,

J. Schneider, S. Seager, W.A. Traub, and N.J. Woolf. Remote sensing of planetary

properties and biosignatures on extrasolar terrestrial planets. Astrobiology, 2(2):153–

181, 2002.

[18] J.R. Fienup. Phase retrieval algorithms: a comparison. Applied Optics, 21(15):2758–

2769, 1982.

[19] JR Fienup. Phase-retrieval algorithms for a complicated optical system. Applied Optics,

32(10):1737–1746, 1993.

[20] J.R. Fienup. Phase-retrieval algorithms for a complicated optical system. Applied Optics,

32(10):1737–1746, 1993.

[21] A. Gelb, J. Kasper, R. Nash, C. Price, and A. Sutherland. Applied Optimal Estimation.

M.I.T Press, 1974. ISBN 0486682005.

[22] R.W. Gerchberg and W.O. Saxton. A practical algorithm for the determination of the

phase from image and diffraction plane pictures. Optik, 35:237–246, 1972.

[23] A. Give’on, B. Kern, S. Shaklan, D.C. Moody, and L. Pueyo. Broadband wavefront

correction algorithm for high-contrast imaging systems. Proceedings of SPIE, 6691:

66910A–1 – 66910A–11, 2007.

[24] A. Give’on, B.D. Kern, and S. Shaklan. Pair-wise, deformable mirror, image plane-

based diversity electric field estimation for high contrast coronagraphy. In Proceedings

of SPIE, volume 8151, page 815110, 2011.

[25] Joseph W. Goodman. Introduction to Fourier Optics. Roberts & Company, 2005.

[26] J.J. Green and S.B. Shaklan. Optimizing coronagraph designs to minimize their contrast

sensitivity to low-order optical aberrations. Optical Science and Technology, 2003.

[27] J.J. Green, S.B. Shaklan, R.J. Vanderbei, and N.J. Kasdin. The sensitivity of shaped

pupil coronagraphs to optical aberrations. In Proceedings of SPIE Conference on As-

tronomical Telescopes and Instrumentation, 5487, pages 1358–1367, 2004.

[28] T.D. Groff and N.J. Kasdin. Designing an optimal estimator for more efficient wavefront

correction. In Proceedings of SPIE, volume 8151, page 81510X, 2011.

[29] T.D. Groff, A. Carlotti, and N.J. Kasdin. Progress on broadband control and deformable

mirror tolerances in a 2-dm system. In Proceedings of SPIE, volume 8151, page 81510Z,

2011.

[30] O. Guyon. Phase-induced amplitude apodization of telescope pupils for extrasolar ter-

restrial planet imaging. Astron. Astrophys, 404:379, 2003.

[31] O. Guyon, J.R.P. Angel, D. Backman, R. Belikov, D. Gavel, A. Give’on, T. Greene,

J. Kasdin, J. Kasting, M. Levine, et al. Pupil mapping exoplanet coronagraphic

observer (peco). In Proceedings of SPIE, volume 7010, page 70101Y, 2008.

[32] O. Guyon, T. Matsuo, and R. Angel. Coronagraphic low-order wave-front sensor: Prin-

ciple and application to a phase-induced amplitude coronagraph. The Astrophysical

Journal, 693:75, 2009.

[33] O. Guyon, E. Pluzhnik, F. Martinache, J. Totems, S. Tanaka, T. Matsuo, C. Blain,

and R. Belikov. High-contrast imaging and wavefront control with a piaa coronagraph:

Laboratory system validation. Publications of the Astronomical Society of the Pacific,

122(887):71–84, 2010. ISSN 0004-6280.

[34] O. Guyon, B. Kern, R. Belikov, S. Shaklan, A. Kuhnert, et al. Phase-induced amplitude

apodization (piaa) coronagraphy: recent results and future prospects. In Proceedings of

SPIE, volume 8151, page 81510H, 2011.

[35] S.B. Howell. Handbook of CCD astronomy, volume 2. Cambridge Univ Pr, 2000. ISBN

0521648343.

[36] N. J. Kasdin and D. A. Paley. Engineering Dynamics: A Comprehensive Introduction.

Princeton University Press, 2011.

[37] N. J. Kasdin, R. J. Vanderbei, and R. Belikov. Shaped pupil coronagraphy. Comptes

Rendus Physique, 8:312–322, April 2007. doi: 10.1016/j.crhy.2007.02.009.

[38] N.J. Kasdin, R.J. Vanderbei, D.N. Spergel, and M.G. Littman. Extrasolar planet finding

via optimal apodized-pupil and shaped-pupil coronagraphs. The Astrophysical Journal,

582(2):1147–1161, 2003.

[39] N.J. Kasdin, A. Carlotti, L. Pueyo, T. Groff, and R. Vanderbei. Unified coronagraph

and wavefront control design. In Proceedings of SPIE, volume 8151, page 81510Y, 2011.

[40] Jason Kay. Electric Field Estimation for High-Contrast Imaging. PhD thesis, Princeton

University, 2009.

[41] C. Keller, V. Korkiakoski, N. Doelman, R. Fraanje, and M. Verhaegen. Extremely

fast focal-plane wavefront sensing for extreme adaptive optics. In Proceedings of SPIE,

volume 8447, page In Press, 2012.

[42] M.J. Kuchner and W.A. Traub. A coronagraph with a band-limited mask for finding

terrestrial planets. The Astrophysical Journal, 570(2):900–908, 2002.

[43] M. Levine, S. Shaklan, and J. Kasting. Terrestrial planet finder coronagraph science

and technology definition team (stdt) report, 2006. URL http://exep.jpl.nasa.gov/

TPF/STDT_Report_Final_Ex2FF86A.pdf.

[44] M. Levine, D. Lisman, S. Shaklan, J. Kasting, W. Traub, J. Alexander, R. Angel,

C. Blaurock, M. Brown, R. Brown, et al. Terrestrial planet finder coronagraph (tpf-c)

flight baseline concept. Arxiv preprint arXiv:0911.3200, 2009.

[45] M. Levine, R. Soummer, et al. Overview of technologies for direct optical imaging of

exoplanets. submitted to the Astro2010 technology development white paper call, 2009.

[46] R.G. Lyon, M. Clampin, P. Petrone, U. Mallik, T. Madison, M.R. Bolcar, M.C. Noecker,

S.E. Kendrick, and M. Helmbrecht. Vacuum nuller testbed (vnt) performance, charac-

terization and null control: progress report. In Proceedings of SPIE, volume 8151, page

81510F, 2011.

[47] B. Macintosh, M. Troy, R. Doyon, J. Graham, K. Baker, B. Bauman, C. Marois,

D. Palmer, D. Phillion, L. Poyneer, I. Crossfield, P.J. Dumont, B. M. Levine, M. Shao,

E. Serabyn, C. Shelton, G. Vasisht, J. K. Wallace, J. Lavigne, P. Valee, N. Rowlands,

K. Tam, and D. Hackett. Extreme adaptive optics for the thirty meter telescope. In

Proceedings of SPIE, volume 6272, pages 62720N–1 – 62720N–15, 2006.

[48] B.A. Macintosh, J.R. Graham, D.W. Palmer, R. Doyon, J. Dunn, D.T. Gavel, J. Larkin,

B. Oppenheimer, L. Saddlemyer, A. Sivaramakrishnan, et al. The gemini planet imager:

from science to design to construction. In Proc. SPIE, volume 7015, pages 7015–43, 2008.

[49] F. Malbet, JW Yu, and M. Shao. High-dynamic-range imaging using a deformable

mirror for space coronography. Publications of the Astronomical Society of the Pacific,

107(710):386–398, 1995.

[50] F. Martinache, O. Guyon, V. Garrel, C. Clergeon, T. Groff, P. Stewart, R. Russell, and

C. Blain. The subaru coronagraphic extreme ao project: progress report. In Proceedings

of SPIE, volume 8151, page 81510Q, 2011.

[51] Y. Minowa, Y. Hayano, S. Oya, M. Watanabe, M. Hattori, O. Guyon, S. Egner, Y. Saito,

M. Ito, H. Takami, et al. Performance of subaru adaptive optics system ao188. In

Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, volume

7736, page 122, 2010.

[52] C. Petit, J.M. Conan, C. Kulcsar, H.F. Raynaud, T. Fusco, J. Montri, and D. Rabaud.

Optimal control for multi-conjugate adaptive optics. Comptes Rendus Physique, 6(10):

1059–1069, 2005.

[53] L. Poyneer and J.P. Veran. Predictive wavefront control for adaptive optics with arbi-

trary control loop delays. JOSA A, 25(7):1486–1496, 2008.

[54] L. Pueyo and N. J. Kasdin. Polychromatic Compensation of Propagated Aberrations

for High-Contrast Imaging. ApJ, 666:609–625, September 2007.

[55] L. Pueyo, J. Kay, N.J. Kasdin, T. Groff, M. McElwain, A. Give’on, and R. Belikov.

Optimal dark hole generation via two deformable mirrors with stroke minimization.

Applied Optics, 48(32):6296–6312, 2009.

[56] L. Pueyo, S.B. Shaklan, A. Give’On, M. Troy, N.J. Kasdin, J. Kay, T. Groff, M. McEl-

wain, and R. Soummer. Correction of quasi-static wavefront errors for elt with two

sequential dms. In Adaptative Optics for Extremely Large Telescopes, volume 1, page

5009, 2010.

[57] L. Pueyo, N. Jeremy Kasdin, A. Carlotti, and R. Vanderbei. Design of phase induced

amplitude apodization coronagraphs over square apertures. The Astrophysical Journal

Supplement Series, 195:25, 2011.

[58] L. Pueyo, N.J. Kasdin, and S. Shaklan. Propagation of aberrations through phase-

induced amplitude apodization coronagraph. JOSA A, 28(2):189–202, 2011.

[59] Laurent Pueyo. Broadband contrast for exo-planet imaging: The impact of propagation

effects. PhD thesis, Princeton University, 2008.

[60] D. Redding et al. Wavefront sensing & control for a large segmented space telescope.

In Bulletin of the American Astronomical Society, volume 41, page 342, 2009.

[61] F. Rigaut. Ground-conjugate wide field adaptive optics for the elts. Beyond conventional

adaptive optics, 58:11–16, 2002.

[62] Dmitry Savransky. Exosystem Modeling for Mission Simulation and Survey Analysis.

PhD thesis, Princeton University, 2011.

[63] S. B. Shaklan and J. J. Green. Reflectivity and optical surface height requirements

in a broadband coronagraph. 1. Contrast floor due to controllable spatial frequencies.

Applied Optics, 45:5143–5153, July 2006.

[64] S.B. Shaklan and J.J. Green. Low-order aberration sensitivity of eighth-order corona-

graph masks. The Astrophysical Journal, 628:474, 2005.

[65] S.B. Shaklan, J.J. Green, and D.M. Palacios. The terrestrial planet finder coronagraph

optical surface requirements. Proceedings of SPIE, 6265:pp. 62651I–1 – 62651I–12, 2006.

[66] S.B. Shaklan, L. Marchen, J. Krist, and M. Rud. Stability error budget for an aggressive

coronagraph on a 3.8 m telescope. In Proceedings of SPIE, volume 8151, page 815109,

2011.

[67] D. N. Spergel. A new pupil for detecting extrasolar planets. astro-ph/0101142, 2000.

[68] R.F. Stengel. Optimal Control and Estimation. Dover Publications, 1994. ISBN

0486682005.

[69] J. Trauger, K. Stapelfeldt, W. Traub, C. Henry, J. Krist, D. Mawet, D. Moody, P. Park,

L. Pueyo, E. Serabyn, et al. Access: a nasa mission concept study of an actively

corrected coronagraph for exoplanet system studies. In Proceedings of SPIE, volume

7010, page 701029, 2008.

[70] S. Unwin and W. Traub. Zodiac: A balloon facility for exoplanet debris disk observa-

tions. In Proceedings of the conference In the Spirit of Lyot 2010: Direct Detection of

Exoplanets and Circumstellar Disks. October 25-29, 2010. University of Paris Diderot,

Paris, France. Edited by Anthony Boccaletti., volume 1, page 35, 2010.

[71] R. J. Vanderbei, E. Cady, and N. J. Kasdin. Optimal Occulter Design for Finding

Extrasolar Planets. The Astrophysical Journal, 665(1):794–798, 2007.

[72] R.J. Vanderbei. Fast fourier optimization, sparsity matters. Mathematical Programming

Computation, pages 1–17, 2012.

[73] R.J. Vanderbei and W.A. Traub. Pupil mapping in 2-d for high-contrast imaging.

Astrophysical Journal, 626:1079–1090, 2005.
